Adaptive method and software architecture for efficient transaction processing and error management

ABSTRACT

A new type of transaction manager is disclosed that provides a unique set of methods and components for efficient transaction processing, error management, and transaction recovery. The combination of these methods and components is applicable to a wide range of business and technical scenarios that do not lend themselves to traditional transaction processing methods, permitting a degree of automation and robustness hitherto impossible. The methods extend and generalize the traditional transaction properties of atomicity, consistency, isolation, and durability.

CROSS REFERENCE TO RELATED APPLICATIONS

This is a division in part of Ser. No. 10/263,589, filed on Oct. 2, 2002. The USPTO issued a restriction requirement on Jan. 12, 2006 requiring the prosecution of either claims 93-181, which invention was classified as belonging to class 707, subclass 8; or claims 182-184, which invention was classified as belonging to class 707, subclass 202. Prosecution of claims 93-181 of the first invention continued under the above-referenced application and serial number. This divisional application is filed to continue the prosecution, separately, of the invention described in claims 182-184, and expressly incorporates both below and by reference all of the original, pre-divisional application's specification and drawings.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

Not Applicable

DESCRIPTION OF ATTACHED APPENDIX

Not Applicable

BACKGROUND OF THE INVENTION

A transaction can be defined as a set of actions on a set of resources or some subset thereof, said actions including changes to those resources. The initial state of a set of resources that will be changed by a transaction is defined as being consistent, and so either implicitly or explicitly satisfies a set of consistency conditions (a.k.a. constraints or integrity rules). Each particular transaction includes one or more operations that may alter the resources (e.g., addition, subtraction, selection, exchange, or transformation). Once defined, the transaction creates a delimitable set of changes from initial conditions. Each change to the resources (short of the final change) creates an intermediate state of those resources, which often is not intended to be accessible to other transactions.

Under such an implementation, each transaction operates on the set of resources in an initial state and, after any operations performed by the transaction, leaves the set of resources in a final state. Thus a transaction may be viewed as a means of transforming a set of resources from an initial consistent state to a final consistent state (possibly, but generally not, the same as the initial state).

Transaction processing is subject to multiple difficulties. A transaction may use resources inefficiently. Transactions may fail to complete operations as designed. Errors may cause the final state to be inconsistent. Transactions may execute too slowly. Such difficulties can be handled manually if the environment is simple enough. Automated or semi-automated means (as supplied, for example, by a transaction management facility) are required in more sophisticated situations.

An environment in which transactions operate is often subject to a transaction management facility, often referred to simply as a “transaction manager.” The responsibility of a transaction manager is to ensure the initial and final states are consistent and that no harmful side effects occur in the event that concurrent transactions share resources (isolation). A transaction manager typically enforces the isolation of a specific transaction using a default concurrency control mechanism (e.g., pessimistic or optimistic). If a condition such as an error occurs before the final state is reached, it is often the responsibility of a transaction management facility to return the system to the initial state. This sort of automated transaction processing lies behind the greatest volume of financial and commercial transactions extant in modern society.

Automated transaction processing, both with and without transaction management facilities, has been designed traditionally with an unspoken assumption that errors are exceptional. The programming, both its design and coding, focuses on implementing transactions in a near-perfect world where it is permissible to simply start over and redo the work if anything goes wrong. Even if this were to model accurately the majority of automated commercial transactions, it would not reflect the entirety of any business's real world experience. In the real world, eighty percent or more of the management effort and expertise is about handling exceptions, mistakes, and imperfections. In automated transaction processing, error recovery mechanisms are usually seen as an afterthought, a final ‘check-box’ on the list of features and transactions that can be handled (if all goes perfectly).

A naïve approach to the implementation of complex automated transaction processing systems maintains that the system resulting from integrating (via transactional messaging) a set of applications that already have error recovery mechanisms will itself recover from errors. Experience and careful analysis have shown that nothing could be further from the truth. As more and more business functions are integrated, the problems of automated error recovery become increasingly important and complex. Errors can propagate just as rapidly as correct results, but the consequences can be devastating.

As more and more business functions are integrated, the problems of automated error recovery and resource management become increasingly important. It's only natural that many of the systems that a business automates first are deemed by that business to enable the execution of its core competencies, whose completion is ‘mission critical’. Automation demands the reliability we associate with transaction management if error recovery is to be robust. With each success at automating a particular business transaction, the value of connecting and integrating disparate automated transactions increases. Separate transactions, each of them simple, when connected become a complex transaction. With each integrative step, the need for acceptable error recovery becomes ever more important.

Traditional approaches to automated transaction management emphasize means to guarantee the fundamental properties of a properly defined or ‘formal’ transaction, which are atomicity, consistency, isolation, and durability. These properties are usually referred to by their acronym, ACID. Transactions, especially if complex, may share access to resources only under circumstances that do not violate these properties, although the degree to which transaction management facilities strictly enforce the isolation property is often at the discretion of the user.

It is not uncommon to refer to any group of operations on a set of resources (i.e., a unit of work) as a transaction, even if they do not completely preserve the ACID properties. In keeping with this practice, we will use the term transaction without a qualifying adjective or other modifier when referring to a unit of work of any kind whether formal or not. We will use the qualified term pseudo-transaction when we want to refer specifically to a unit of work that does not preserve all of the ACID properties, although it may preserve some of them. Pseudo-transactions exist for a variety of reasons including the difficulty of proper transaction design and enforcement, incomplete knowledge of consistency rules, attempts to increase concurrency at the expense of decreased isolation, attempts to increase performance at the expense of atomicity, and so on.

The ACID properties lead to a very specific behavior when one or more of the elements that compose a transaction fail in a manner that cannot be transparently recovered (a so-called “unrecoverable error”): the atomicity property demands that the state of the resources involved be restored so that it is as though no changes whatsoever had been made by the transaction. Thus, an unrecoverable error always results in transitioning to the initial state (i.e., the initial state being restored), the typical process for achieving this being known as “rollback.” An alternative method of restoring the initial state is to run an “undo” or “inverse” transformation known as a compensating transaction (discussed in more detail below). This of course presumes that for such mandated compensating transactions, for every error it is possible to first identify the class of error, then the most suitable compensating transaction, and finally to implement that compensating transaction. A problem with the current approach to enforcing atomicity is that viable work is often wasted when the initial state is recovered. A second problem is that transactions dependent on a failed transaction cannot begin until the failed transaction is resubmitted and finally completes, thereby possibly resulting in excessive processing times and perhaps ultimately causing a failure to achieve the intended business purpose.
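By way of illustration only, the traditional rollback behavior described above might be sketched as follows. The account dictionary, the `transfer` function, and the `InsufficientFunds` exception are hypothetical names introduced for this example and form no part of the disclosure; the sketch simply saves the initial state and restores it when an unrecoverable error occurs.

```python
import copy


class InsufficientFunds(Exception):
    """Hypothetical error standing in for an unrecoverable failure."""


def transfer(accounts, src, dst, amount):
    """Apply a two-step transfer so that either both steps take effect or neither does."""
    snapshot = copy.deepcopy(accounts)      # initial state, kept for rollback
    try:
        accounts[dst] += amount             # creates an intermediate state
        if accounts[src] < amount:
            raise InsufficientFunds(src)
        accounts[src] -= amount             # reaches the final state
    except InsufficientFunds:
        accounts.clear()
        accounts.update(snapshot)           # rollback: restore the initial state
        return False
    return True


accounts = {"a": 100, "b": 0}
ok = transfer(accounts, "a", "b", 30)       # completes
failed = transfer(accounts, "a", "b", 500)  # fails and is rolled back
```

Note that the credit already applied in the failed transfer is discarded along with everything else, which is the wasted work the paragraph above refers to.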

The consistency property guarantees the correctness of transactions by enforcing a set of consistency conditions on the final state of every transaction. Consistency conditions are usually computable, which means that a software test is often executed to determine whether or not a particular consistency condition is satisfied in the current state. Thus, a correctly written transaction becomes one which, when applied to resources in a first consistent state, transforms those resources into a second (possibly identical) consistent state. Intermediate states, created as the component operations of a transaction are applied to resources, may or may not satisfy a set of consistency conditions and so may or may not be a consistent state. A problem with this approach is that consistency must be either cumulative during the transaction, or else enforced at transaction completion. In most cases, transactions are assumed to be written correctly and the completion of a transaction is simply assumed to be sufficient to ensure a consistent state. This leads to a further problem: the interactions among a collection of transactions that constitute a complex transaction may not result in a consistent state unless all consistency rules are enforced automatically at transaction completion.
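As a minimal sketch of what "computable" consistency conditions can look like, each condition below is a predicate evaluated against a state. The two conditions, the fixed total of 100, and the three example states are assumptions chosen for illustration only.

```python
def total_is_conserved(state, expected_total=100):
    """Hypothetical condition: money is neither created nor destroyed."""
    return sum(state.values()) == expected_total


def no_negative_balance(state):
    """Hypothetical condition: no account is overdrawn."""
    return all(balance >= 0 for balance in state.values())


CONDITIONS = [total_is_conserved, no_negative_balance]


def is_consistent(state):
    """A state is consistent when every registered condition holds."""
    return all(condition(state) for condition in CONDITIONS)


initial = {"a": 100, "b": 0}
intermediate = {"a": 100, "b": 30}   # credit applied, debit still pending
final = {"a": 70, "b": 30}
```

Here `is_consistent(initial)` and `is_consistent(final)` hold while `is_consistent(intermediate)` does not, matching the observation that intermediate states may or may not be consistent.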

For complex transactions that share resources, the isolation property further demands that concurrent or dependent transactions behave as though they were run in isolation (or were independent): that is, no other transaction can have seen any intermediate changes (there are no “side effects”) because these might be inconsistent. The usual approach to ensuring the isolation property is to lock any resource that is touched by the transaction, thereby ensuring that other transactions cannot modify any such resource (a share lock) and cannot access modified resources (an exclusive lock). With regard to resource management, locking is used to implement a form of dynamic scheduling. The most commonly used means for ensuring this is implementing the rule known as “two-phase locking” wherein while a transaction is processing, locks on resources accessed by that transaction are acquired during phase one and are released only during phase two, with no overlap in these phases. Such an implementation guarantees that concurrent or dependent transactions can be interleaved while preserving the isolation property. A problem with this approach is that it necessarily increases the processing time of concurrent transactions that need to access the same resources, since once a resource is locked, it may not be modified by any other transaction until the locking transaction has completed. Another problem due to this approach is that it occasionally creates a deadly embrace or deadlock condition among a group of transactions. In the simplest case of the group consisting of only two transactions, each of the two transactions waits indefinitely for a resource locked by the other. Deadlock conditions can arise in complex ways among groups of more than two transactions.

Other approaches to maintaining the isolation property include optimistic concurrency (such as time stamping) and lock or conflict avoidance (such as static scheduling via transaction classes or conflict graphs, nested transactions, and multi-versioning). Various caching schemes have been designed to improve concurrency by minimizing the time required to access a resource, while respecting a particular approach to enforcing the isolation property. Each of the existing approaches to enforcing isolation, and the associated techniques and implications for resource management, fails to meet the needs imposed by complex, possibly distributed, business transactions.
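The two-phase locking rule described above can be sketched, under simplifying assumptions, as follows. The `Transaction` class, the shared lock table, and the exception name are hypothetical; only exclusive locks are modeled, and a refused request simply returns `False` rather than blocking.

```python
class TwoPhaseLockError(Exception):
    """Raised when a transaction tries to acquire a lock after releasing one."""


class Transaction:
    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table   # shared dict: resource -> owning transaction
        self.held = set()
        self.shrinking = False         # True once phase two (release) has begun

    def acquire(self, resource):
        """Phase one: acquire an exclusive lock, or report that the caller must wait."""
        if self.shrinking:
            raise TwoPhaseLockError("cannot acquire a lock after releasing")
        owner = self.lock_table.get(resource)
        if owner not in (None, self.name):
            return False               # held by another transaction
        self.lock_table[resource] = self.name
        self.held.add(resource)
        return True

    def release_all(self):
        """Phase two: release every lock; no further acquisition is allowed."""
        self.shrinking = True
        for resource in self.held:
            del self.lock_table[resource]
        self.held.clear()


locks = {}
t1 = Transaction("T1", locks)
t2 = Transaction("T2", locks)
got_x = t1.acquire("x")            # T1 locks x
blocked = t2.acquire("x")          # T2 must wait: this is the concurrency cost
t1.release_all()
granted_later = t2.acquire("x")    # T2 proceeds only after T1 has finished
try:
    t1.acquire("y")
    violated = False
except TwoPhaseLockError:
    violated = True                # the phases may not overlap
```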

If no error occurs, the completion of the transaction guarantees not only a consistent state, but also a durable one (the durability property) through a process known as “commit.” The step in a transaction at which a “commit” is processed is known as the commit point. The durability property is intended to guarantee that the specific result of a completed transaction can be recovered at a later time, and cannot be repudiated. Ordinarily, the durability property is interpreted as meaning that the final state of resources accessed by a transaction is, in effect, recorded in non-volatile storage before confirming the successful completion of the transaction. Usually, this is done by recording some combination of resource states, along with the operations that have been applied to the resources in question. The software that handles this recording is called a resource manager.

A variant of the commit point, in which a user (possibly via program code) asserts to the transaction manager that they wish to make the then current state recoverable and may subsequently wish to roll back work to that known state, is known as a savepoint. Because savepoints are arbitrarily defined, they need not represent a consistent state. Furthermore, the system will return to a specific savepoint only at the explicit request of the user. Typically, savepoints are not durable. Savepoints cannot be asserted automatically by the system except in the most rudimentary fashion as, for example, after every operation or periodically based on elapsed time or quantity of resources used. None of these approaches enables the system to determine to which savepoint it should roll back.
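A savepoint of the kind just described might be sketched as below. The `Workspace` class and the inventory fields are hypothetical; the savepoints live only in memory, reflecting the remark that savepoints are typically not durable.

```python
import copy


class Workspace:
    """Holds a mutable state plus named, in-memory (non-durable) savepoints."""

    def __init__(self, state):
        self.state = state
        self.savepoints = {}

    def savepoint(self, name):
        # The user decides when to call this; the state need not be consistent.
        self.savepoints[name] = copy.deepcopy(self.state)

    def rollback_to(self, name):
        # Only an explicit request returns the workspace to a savepoint.
        self.state = copy.deepcopy(self.savepoints[name])


w = Workspace({"stock": 10, "reserved": 0})
w.state["reserved"] += 3
w.savepoint("after_reserve")     # stock not yet decremented
w.state["stock"] -= 3
w.state["stock"] -= 99           # an erroneous operation
w.rollback_to("after_reserve")   # discards both later operations
```

The rollback also discards the valid decrement of 3, and nothing in the sketch tells the system which savepoint to choose; that choice rests entirely with the caller.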

When the elements of a transaction are executed (whether concurrent or sequential) under multiple, independent resource managers, the rollback and commit processes can be coordinated so that the collection behaves as though it were a single transaction. In essence, the elements are implemented as transactions in their own right, but are logically coupled to maintain ACID properties to the desired degree for the collection overall. Such transactions are called distributed transactions. The usual method for achieving this coordination is called two-phase commit. Unfortunately, this is an inefficient process which tends to reduce concurrency and performance, and cannot guarantee coordination under all failure conditions. Under certain circumstances, a system failure during two-phase commit can result in a state that is incorrect and that then requires difficult, costly, and time-consuming manual correction during which the system is likely to be unavailable. As with single transactions, compensating transactions can sometimes be used to restore the initial state of a collection of logically coupled transactions. In such cases, it may be necessary to run special compensating transactions that apply to the entire collection of transactions (known as a compensation sphere whether or not the collection is a distributed transaction).
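The coordination performed by two-phase commit can be sketched in simplified form as follows. The `Participant` class and its `can_commit` flag are hypothetical stand-ins for independent resource managers, and the sketch deliberately omits the coordinator and network failures that make the real protocol unable to guarantee coordination in every case.

```python
class Participant:
    """Hypothetical stand-in for an independent resource manager."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase one: vote on whether the local work can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase one."""
    votes = [p.prepare() for p in participants]   # phase one: collect votes
    if all(votes):
        for p in participants:                    # phase two: commit
            p.commit()
        return True
    for p in participants:                        # phase two: roll back
        p.rollback()
    return False


good = [Participant("db"), Participant("queue")]
bad = [Participant("db"), Participant("queue", can_commit=False)]
good_result = two_phase_commit(good)
bad_result = two_phase_commit(bad)
```

Every participant must remain prepared, holding its resources, until the last vote arrives, which illustrates the concurrency penalty noted above.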

There are numerous optimizations and variations on these techniques, including split transactions, nested transactions, and the like. In practice, all these approaches have several disadvantages (and differ from the present invention):

poor concurrency due to locking is common;

the cost of rollback, followed by redoing the transaction, can be excessive;

the conditions of consistency, isolation, and durability are tightly bound together;

logically dependent transactions must either (a) be run sequentially with the possibility that an intervening transaction will alter the final state of the first transaction before the second transaction can take over, or (b) be run together as a distributed transaction, thereby locking resources for a much longer time and introducing two-phase commit performance and concurrency penalties;

there is significant overhead in memory and processing costs on already complex transactions;

the errors which are encountered and identified are not recorded (which can complicate systematic improvement of a system);

it is often undesirable in a business scenario to return a set of resources to some prior state, especially when a partially ordered set of interdependent transactions (i.e., a business process) has been run;

it is not always possible to define a compensating transaction for a given transaction, and the best compensating transaction often depends on context;

business transactions may result in very long times from start to completion, and may involve many logically coupled transactions, possibly each running under separate transaction or resource managers; and, finally,

the transaction manager will not be able to compensate for or recover from certain context-dependent, external actions that affect resources external to the resource manager.

Transactions can be classified broadly into three types, with corresponding qualifiers or adjectives: physical, logical, and business. A physical transaction is a unit of recovery; that is, a group of related operations on a set of resources that can be recovered to an initial state as a unit. The beginning (and end) of a physical transaction is thus a point of recovery. A physical transaction should have the atomicity and durability properties. A logical transaction is a unit of consistency; that is, a group of related operations on a set of resources that together meet a set of consistency conditions and consisting of one or more coordinated physical transactions. The beginning (and end) of a logical transaction is a point of consistency. In principle, logical transactions should have the ACID properties. A business transaction is a unit of audit; that is, a group of related operations on a set of resources that together result in an auditable change and consisting of one or more coordinated transactions. If, as is the ideal construction, each of these component transactions is a logical transaction, business transactions combine to form a predictable, well-behaved system. The beginning and end of a business transaction are thus audit points, by which we mean that an auditor can verify the transaction's identity and execution. Audit information obtained might include identifying the operations performed, in what order (to the degree it matters), by whom, when, with what resources, precisely which possible decision alternatives were taken in compliance with which rules, and confirmation that the audit system was not circumvented. Business transactions can be composed of other business transactions. Time spans of a business transaction can be as short as microseconds or span decades (e.g., life insurance premium payments and eventual disbursement which must meet the consistency conditions imposed by law and policy).

The efficiency, correctness, and auditability of automated business transactions have a tremendous influence on a business's profitability. As transaction complexity increases, the impact of inefficiencies and errors increases combinatorially.

There are at least four general classes of ways that transactions can be complex. First, a transaction may involve a great deal of detail in its definition, each step of which may be either complex or simple, and may inherently require considerable time to process. Even if each individual step or operation is simple, the totality of the transaction may exceed the average human capacity to understand it in detail: for example, adding the total sum of money paid to a business on a given day, when the number of inputs is in the millions. This sort of complexity is inherently addressed (to the degree possible) by automation, and by following the well-known principles of good transaction design.

Second, a transaction may be distributed amongst multiple, separate environments, each such environment handling a sub-set of the total transaction. The set of resources may be divisible or necessarily shared, just as the processing may be either sequential or concurrent, and may be dependent or independent. Distributed transactions inherently impose complexity in maintaining the ACID properties and on error recovery.

Third, a transaction may be comprised of multiple, linked transactions: for example, adding all of the monies paid in together, adding all of the monies paid out together, and summing the two, to establish a daily net cashflow balance for a company. Such joined transactions may include as a sub-transaction any of the three complex transactions (including other joined transactions, in recursive iteration). And, of course, linked transactions may then be further joined, theoretically ad infinitum. Each sub-transaction is addressed as its own transaction, and thus is handled using the same means and definitiveness. Linked transactions can become extremely complex due to the many ways they can be interdependent, thus making their design, maintenance, and error management costly and their use risky. Tremendous care must be taken to keep complexity under control.

Fourth, and last, a transaction may run concurrently in a mix of transactions (physical, logical, business, and pseudo). As the number of concurrent transactions, the number of inter-dependencies, or the speed of processing increases, or as the available resources decrease, the behavior of the transaction becomes more complex. Transaction managers, careful transaction design, and workload scheduling to avoid concurrency are among the methods that are used to manage this type of complexity, but provide only limited relief. Part of the problem is that the group behavior of the mix becomes increasingly unpredictable, and therefore unmanageable, with increasing complexity.

A business process may be understood as consisting of a set of partially-ordered inter-dependent or linked transactions (physical, logical, business, and pseudo), sometimes relatively simple and sometimes enormously complex, itself implementing a business transaction. The flow of a business process may branch or merge, can involve concurrent activities or transactions, and can involve either synchronous or asynchronous flows. Automated business process management is rapidly becoming the principal means of enabling business integration and business-to-business exchanges (e.g., supply chains and trading hubs).

Knowledge of both the internal logical structure of transactions and the inter-relationships among a group of transactions is often represented in terms of an inter-connected set of dependencies. Two types of dependency are important here: semantic and resource. If completion of an operation (or transaction) A is a necessary condition for the correct completion of some operation (or transaction) B, B is said to have a semantic dependency on A. If completion of an operation (or transaction) T requires some resource R, transaction T is said to have a resource dependency on the resource R. Resource dependencies become extremely important to the efficiency of transaction processing, especially if the resource cannot be shared (that is, if a principle of mutual exclusion is either inherent or enforced). In such cases, transactions (or operations) that depend on the resource become serialized on that resource, and thus, transactions that require the resource depend on (and wait for) the completion of the transaction that has the resource.

Dependencies are generally depicted via a directed graph, in which the nodes represent either transactions or resources and arrows represent the dependency relationship. The graph that represents transactions that wait for some resource held by another transaction, for example, is called a “wait graph.” Dependency graphs may be as simple as a dependency chain or even a dependency tree, or may be a very complex, non-flat network.
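A wait graph of the kind described above can be represented as an adjacency mapping, and a deadlock then corresponds to a cycle in that graph. The following is a sketch using a standard depth-first search; the function name and the example graphs are assumptions made for illustration.

```python
def has_deadlock(wait_graph):
    """Return True if the wait graph (transaction -> transactions it waits for) has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {node: WHITE for node in wait_graph}

    def visit(node):
        color[node] = GREY
        for waited_on in wait_graph.get(node, ()):
            state = color.get(waited_on, WHITE)
            if state == GREY:             # back edge: a cycle of waiters
                return True
            if state == WHITE and visit(waited_on):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in list(wait_graph))


# A simple dependency chain: T1 waits for T2, which waits for T3.
chain = {"T1": ["T2"], "T2": ["T3"], "T3": []}
# The two-transaction deadly embrace: each waits for the other.
embrace = {"T1": ["T2"], "T2": ["T1"]}

chain_deadlocked = has_deadlock(chain)
embrace_deadlocked = has_deadlock(embrace)
```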

The value of successfully managing complexity through automated means grows as the transactions being managed become more complex, as this uses computerization's principal strength: the capacity for managing tremendous amounts of detail, detail that would certainly overwhelm any single human worker, and threaten to overwhelm a human organization not equipped with computer tools.

Unfortunately, the cost of any error that may propagate, for example, down a dependency chain of simple transactions, or affect a net of distributed transactions, also increases. Moreover, the cost of identifying possible sources of error increases as the contextual background for a complex transaction broadens, as all elements, assumptions, and consequences of particular transition states that may be visited while the transaction is processing must be examined for error. One certainty is that the law of unintended consequences operates with harsh and potentially devastating impact on program designers and users who blithely assume that their processes will always operate exactly as they are intended, rather than exactly according to what they are told (and sometimes more telling, not told) to do.

Error-handling for complex transactions currently operates with a bias towards rescinding a flawed transaction and restoring the original starting state. Under this approach, only when a transaction has successfully and correctly completed is the computer program granted permission to commit itself to the results and permanently accept them. If an error occurs, then the transaction is rolled back to the starting point and the data and control restored. This “either commit or rollback” approach imposes a heavy overhead load on complex transaction processing. If the complex transaction is composed of a chain of single, simpler transactions, then the entire chain must be rolled back to the designated prior commit point. All of the work done between the prior commit point and the error is discarded, even though it may have been valid and correct. If the complex transaction is a distributed one, then all resources used or affected by the transaction must be tracked and blocked from other uses until a transaction has successfully attained the next commit point; and when a single part of the entire distributed transaction encounters an error, all parts (and the resources used) must be restored to the values established at the prior commit point. Again, the work that has been successfully performed, even that which is not affected by the error, must be discarded. With linked transactions or any mix involving possibly interdependent pseudo-transactions, no general solution to the problem of automated error recovery has heretofore been presented.

Furthermore, the standard approach treats all transactional operations as identical. Operations, however, differ as to their reversibility, particularly in computer operations. Addition of zero may be reversible by subtracting zero. But multiplication by zero, even though the result is boring, is not exactly reversible by division by zero. Non-commutable transactions are not differentiated from commutable ones, nor do they have more stringent controls placed around their inputs and operation.
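The difference in reversibility can be made concrete with a small sketch in which each operation is paired with its inverse when one exists. The helper names `add` and `multiply` and the pairing convention are assumptions for illustration only.

```python
def add(k):
    """Return (forward, inverse) for adding k; addition is always reversible."""
    return (lambda x: x + k, lambda y: y - k)


def multiply(k):
    """Return (forward, inverse) for multiplying by k; no inverse exists when k is zero."""
    if k == 0:
        return (lambda x: 0, None)    # the original value cannot be recovered
    return (lambda x: x * k, lambda y: y / k)


forward, inverse = add(0)
recovered = inverse(forward(7))       # adding zero is undone by subtracting zero

_, zero_inverse = multiply(0)         # multiplication by zero has no inverse

times4, divide4 = multiply(4)
round_trip = divide4(times4(3))
```

A transaction manager that treats both kinds of operation identically has no way to know that the second one cannot be undone after the fact.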

A second method currently used for error-handling in complex transactions is the application, after an error, of a pre-established compensatory mechanism, also called (collectively) compensating transactions as noted above. This presumes that all errors experienced can be predetermined, fit into particular categories, and a proper method of correction devised for each category. Using compensating transactions introduces an inherent risk of unrecoverable error: compensating transactions may themselves fail. Dependence entirely on compensating transactions risks the imposition of a Procrustean solution on a correct transaction that has been mistakenly identified as erroneous, or even on an erroneous transaction where the correction asserted becomes worse than the error.

Inherent in the use of compensating transactions is an assumption that each individually defined transaction has a matching transaction (the “compensating transaction”) that will “undo” any work that the original transaction did. When transactions are treated in isolation or are applied sequentially, it is pretty easy to come up with compensating transactions. All that is needed is the state of the system saved from the beginning of the transaction and a function to restore that state. (In essence, this is how one recovers a file using a backup copy. All that is lost is the intermediate correct stages between preparation of the backup and the occurrence of the error.) When transactions become interleaved, this simplistic notion of a compensating transaction no longer works and the implementation becomes a bit trickier. In fact, a compensating transaction may not even exist for certain transactions. The compensating transaction may be selected and applied automatically by the transaction manager. Still, the process is much the same: the system is ultimately returned to an earlier state or its equivalent.

Automated support for compensating transactions requires that, for each transaction, a corresponding compensating transaction be registered with an error management system so that recovery can take place automatically and consistently. The rules for using compensating transactions become more complex as the transaction model departs further from the familiar “flat” model. Formally, compensating transactions should always return a system to a prior state. If multiple systems are recovered, they are all recovered to prior states that share a common point in time. If the atomic actions that make up a transaction can be done in any order, and if each of these has an undo operation, then such a compensating transaction can always be defined. Three guidelines have been published (McGoveran, 2000): (1) Try to keep the overall transaction model as close as possible to the traditional “flat” model or else a simple hierarchy of strictly nested transactions. (2) Design the atomic actions so that order of application within a transaction does not matter. (3) Make certain that compensating transactions are applied in the right order.
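Registration of compensating transactions, and the importance of applying them in the right order, might be sketched as follows. The `CompensationLog` class and the arithmetic actions are hypothetical; the sketch assumes that "the right order" means most recent first, which holds for this example because the two actions do not commute.

```python
class CompensationLog:
    """Records a compensating action for each action that has been run."""

    def __init__(self):
        self._undo = []

    def run(self, action, compensation):
        action()
        self._undo.append(compensation)   # register the matching compensation

    def compensate(self):
        while self._undo:
            self._undo.pop()()            # apply compensations most recent first


state = {"value": 10}


def add5():
    state["value"] += 5


def sub5():
    state["value"] -= 5


def double():
    state["value"] *= 2


def halve():
    state["value"] //= 2


log = CompensationLog()
log.run(add5, sub5)
log.run(double, halve)
after_forward = state["value"]            # (10 + 5) * 2

log.compensate()
restored = state["value"]                 # halve first, then subtract 5

# Applying the same compensations in the original order gives a different result.
wrong_order = (after_forward - 5) // 2
```

If the atomic actions were designed so that their order did not matter, as guideline (2) recommends, the order of compensation would not matter either.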

A transaction logically consists of a begin transaction request, a set of steps or operations, each typically (though not necessarily) processed in sequential order of request and performing some manipulation of identified resources, and a transaction end request (which may be, for example, a commit, an abort, a rollback to named savepoint, and the like). Because the state of the art typically processes each step in the order received, the management of affected resources is largely cumulative rather than either pre-determined or predictive, even when the entire transaction is submitted at one time. Resource management, and in particular the scheduling of both concurrent transactions and the operations of which they are composed, may be either static or dynamic. Static scheduling uses various techniques such as conflict graphs to determine in advance of execution which transactions and operations may be interleaved or run concurrently. Dynamic scheduling uses various techniques such as locking protocols to determine at execution time which transactions and operations may be interleaved or run concurrently.
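Static scheduling with a conflict graph can be sketched as follows, on the assumption that each transaction declares its read and write sets before execution. The dictionary layout, the function names, and the three example transactions are hypothetical. Two transactions are taken to conflict when one writes a resource that the other reads or writes.

```python
from itertools import combinations


def conflicts(t1, t2):
    """Two transactions conflict if either writes something the other reads or writes."""
    return bool(
        t1["writes"] & (t2["reads"] | t2["writes"])
        or t2["writes"] & t1["reads"]
    )


def conflict_graph(transactions):
    """Return the set of conflicting pairs, computed before anything executes."""
    edges = set()
    for a, b in combinations(sorted(transactions), 2):
        if conflicts(transactions[a], transactions[b]):
            edges.add((a, b))
    return edges


transactions = {
    "T1": {"reads": {"x"}, "writes": {"y"}},
    "T2": {"reads": {"y"}, "writes": {"z"}},
    "T3": {"reads": {"x"}, "writes": {"w"}},
}
edges = conflict_graph(transactions)
```

Pairs that do not appear in `edges` (here T1 with T3, and T2 with T3) may be interleaved freely, while T1 and T2 must be ordered; a dynamic scheduler would instead discover the same conflict at execution time through locking.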

SUMMARY OF THE INVENTION

As outlined above, the usual interpretation of the ACID properties introduces a number of difficulties. The current interpretation of the atomicity property has resulted in an approach to error recovery that is costly in terms of both time and other resources in that it requires the ability to return affected resources to an initial state. The current interpretation of the consistency property recognizes consistent states only at explicit transaction boundaries, resulting in excessive processing at the end of a transaction and increased chance of failure. The isolation property is interpreted as strictly precluding the sharing of modified resources and operations, so that performance is affected and certain operations may be performed redundantly even when they are identical. Finally, the durability property is generally interpreted as requiring a hard record of only the final state of a transaction's resources (or its equivalent), thereby sometimes requiring excessive processing at commit or rollback. All of these taken together result in less than optimal use of resources and inefficient error recovery mechanisms. The traditional techniques for preserving the ACID properties, optimizing resource usage, and recovering from errors cannot be applied effectively in many business environments involving complex transactions, especially those pertaining to global electronic commerce and business process automation.

The current invention introduces a method of transaction processing, comprised of a set of sub-methods which preserve the ACID properties without being restricted by the traditional interpretations. The concept of atomicity is refined to mean that either all effects specific to a transaction will complete or they will all fail. The concept of consistency is refined to mean that whenever a class of consistency conditions applies to two states connected by a set of operations which are otherwise atomic, isolated, and durable as defined here, that set of operations constitutes an implicit transaction. The isolation property is refined to mean that no two transactions produce a conflicting or contradictory effect on any resource on which they are mutually and concurrently (that is, during the time they are processed) dependent. The durability property is refined to mean that the final state of a transaction is recoverable insofar as that state has any effect on the consistency of the history of transactions as of the time of recovery. Thus, if the recovered state differs from the final state in any way, the durability property is a guarantee that all those differences are consistent with all other recovered states and external effects of the transaction history. Finally, a logical transaction is understood as a transition from one state in a class of consistent states to a state in another class of consistent states. This is similar to, but clearly distinct from, the concept that the interleaved operations of a set of serializable, concurrent transactions produce a final result that is identical to at least one serial execution of those transactions. Just as serializability provides no guarantee as to which apparent ordering of the transactions will result, so the new understanding of a logical transaction provides no guarantee as to which consistent state in the class of achievable states will result.

The present invention asserts that these refinements of the ACID properties and of logical transactions permit a more realistic computer representation of transaction processing, especially business transaction processing. Furthermore, these refinements permit transaction processing methods that include both the traditional methods and the sub-methods described in this invention. The new set of sub-methods used, both individually and together, make it possible to manage complex transactional environments, while optimizing the use of resources in various ways. These techniques extend to distributed transactions, and to business transactions which span both multiple individual transactions as, for example, in a business process, and multiple business entities as is required in electronic commerce and business-to-business exchanges.

In particular, these sub-methods include: (1) establishing and using consistency points, which minimizes the cost of recovery under certain types of error; (2) transaction relaying, which permits work sharing across otherwise isolated transactions, while simultaneously minimizing the impact of failures; (3) corrective transactions, which permit error recovery without unnecessarily undoing work, without so-called compensating transactions, and while enabling the tracking and correlation of errors and their correction; (4) lookahead-based resource management based on dependencies, which enables optimized resource usage within and among transactions; and (5) dependency-based concurrency optimization, which enables optimized scheduling and isolation of transactions while avoiding the high cost of locking and certain other concurrency protocols wherever possible. Each of these sub-methods is capable of being used in complex transaction environments (including distributed, linked, and mixed) while avoiding the overhead associated with traditional transaction management techniques such as two-phase commit, each can be used in combination with the others, and each is detailed in the description of the invention below.

Two of the sub-methods introduced here, consistency points and corrective transactions, address the problem of error recovery and correction. Consistency points differ from savepoints in that they add the requirement of a consistent state, possibly automatically detected and named. Corrective transactions differ from compensating transactions in that they effectively enfold both error repair and the correction, whereas compensating transactions address only error repair. One problem with the current approaches to handling errors that occur during complex or distributed transactions is that they fail almost as often as they succeed. A second problem is that they are difficult for the human individuals who experience both the problem and the correction, because they do not meet people's expectations of how the real world handles problems. A third problem is that they do not offer an opportunity to record both the error and the correction applied, which makes adaptive improvements harder to derive, as much of the value of the experience (how the mistake was made and how it was corrected) is discarded after the correction is completed. A fourth problem is that they are relatively inefficient. Jointly, consistency points and corrective transactions overcome these problems.

The transaction relaying sub-method provides a means for efficient, consistent management of inter-dependent transactions without violating atomicity or isolation requirements, without introducing artificial transaction contexts, and while enabling resource sharing. Current approaches for linking inter-dependent transactions (through, for example, a single distributed transaction with two-phase commit, as chained transactions, or through asynchronous messaging) do not simultaneously ensure ACID properties and efficient, manageable error recovery. One problem with current approaches is the high resource cost of ensuring consistency and atomicity (the latter becoming a somewhat artificially expanded requirement). A second problem is the high cost of error recovery, inasmuch as the approach introduces difficult-to-manage failure modes, most of which are incompatible with the sub-method of corrective transactions introduced here. A third problem is that the approach, in an attempt to avoid the high overhead of distributed transactions, may permit inconsistencies. A fourth problem is that they may be compatible only with flat transaction models, while required business transactions and business processes cannot be implemented using a flat transaction model. Transaction relaying overcomes these problems.

The remaining two sub-methods, lookahead-based resource management and dependency-based concurrency optimization, each enable efficient use of resources, especially in highly concurrent environments. One problem with current approaches is that they do not make good use of information known in advance of transaction or operation execution, but depend primarily on dynamic techniques, with the result that hand-coded solutions may perform more efficiently. A second problem is that they may not be compatible with the method (or the individual sub-methods) introduced here; hence an alternative approach to resource management and concurrency optimization is required to make the other new sub-methods viable. Lookahead-based resource management and dependency-based concurrency optimization address these problems.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a transaction state graph contrasting transaction processing error recovery, with and without consistency points.

FIG. 2 is a transaction state graph illustrating a corrective transaction.

FIG. 3 is a transaction state graph illustrating transaction relaying.

In FIGS. 1-3, the thicker lines indicate the intended, error-free flow of work, while the thinner lines indicate corrective or ameliorative efforts once an error occurs.

FIG. 4 is an example of code reorganization and optimization using lookahead resource management.

FIG. 5 is a transaction state graph illustrating an example (one possible alternative out of many) of dependency-based concurrency control.

FIG. 6 is an overview of a component combination for the joint application of the sub-methods, implemented in an ATM.

DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 1: At time t1 (1), a transaction is begun and the current state is effectively saved. A portion of work is done between t1 (1) and t2 (2) and another portion of work is done between t2 (2) and t3 (3). At time t4 (4) and before the transaction can reach its intended completion state (5) an error is detected. Without consistency points, the ATM initiates a rollback (7) and restores the initial state (1) at time t5, effectively losing all the work done prior to time t4 (4). The entire transaction must now be redone.

By contrast, if the transaction manager detects and saves a consistency point at time t3 (3), the ATM initiates a lesser rollback (6) and restores the saved consistency point (3) at time t5. The work done between t1 (1) and t3 (3) is preserved, and only the work done after time t3 (3) and prior to time t4 (4) is lost and must be redone.
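The contrast drawn for FIG. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ATM's implementation: the class name `ConsistencyPointTransaction`, the in-memory dictionary of resources, and the caller-supplied `is_consistent` predicate are all hypothetical, and consistency points here are non-durable copies held in memory.

```python
import copy


class ConsistencyPointTransaction:
    """Toy transaction over a dict of resources (hypothetical, for illustration only)."""

    def __init__(self, resources, is_consistent):
        self.resources = resources
        self.is_consistent = is_consistent        # caller-supplied consistency condition
        self.initial = copy.deepcopy(resources)   # state saved when the transaction begins
        self.points = []                          # saved consistency points, oldest first

    def apply(self, step):
        """Run one step, then save a consistency point if the new state is consistent."""
        step(self.resources)
        if self.is_consistent(self.resources):
            self.points.append(copy.deepcopy(self.resources))

    def rollback(self):
        """Restore the most recent consistency point, or the initial state if none exists."""
        saved = self.points[-1] if self.points else self.initial
        self.resources.clear()
        self.resources.update(copy.deepcopy(saved))


# Assumed consistency condition: the two balances always total 100.
accounts = {"a": 100, "b": 0}
txn = ConsistencyPointTransaction(accounts, lambda r: r["a"] + r["b"] == 100)
txn.apply(lambda r: r.update(a=r["a"] - 30))  # a=70, b=0: not consistent, nothing saved
txn.apply(lambda r: r.update(b=r["b"] + 30))  # a=70, b=30: consistency point saved
txn.apply(lambda r: r.update(a=r["a"] - 10))  # a=60, b=30: an error is detected here
txn.rollback()                                # the lesser rollback of FIG. 1
print(accounts)  # {'a': 70, 'b': 30}
```

Only the last step has to be redone. With a predicate that never reports consistency, `rollback()` falls back to the initial state, which corresponds to the full rollback described for the case without consistency points.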

FIG. 2: Transaction A begins at consistency point CP0 (8), transitioning state through consistency points CP1 (9) and CP2 (10); then Transaction A commits and Transaction B begins. Transaction B encounters an undesirable condition E1 (11) before it can transition to consistency point CP3 (12) and commit. The ATM determines that condition E1 (11) is associated with consistency points of category C1, and that only CP1 (9) of prior consistency points CP0, CP1, and CP2 belongs to category C1. The ATM then restores the state to consistency point CP1 (9). It further determines that reachable consistency points CP3 (12) and CP6 (13) belong to the same consistency category C2 while consistency point CP5 (14) belongs to consistency category C3. Transaction C is then executed as a corrective transaction, transitioning state from consistency point CP1 (9) to consistency point CP4 (15), and then Transaction D is executed, transitioning state from consistency point CP4 (15) to consistency point CP6 (13), an acceptable state, where it commits. A second alternative would have been to execute Transaction C as a corrective transaction, transitioning state from consistency point CP1 (9) to consistency point CP4 (15), and then execute Transaction E, transitioning state from consistency point CP4 (15) to consistency point CP5 (14), another acceptable state, where it commits.

FIG. 3: Transaction A begins to use resource sets RS0 (17) and RS1 (18), which are both in a consistent state, at consistency point CP1 (19). Both transition to durable consistency point CP3 (20), at which point Transaction A notifies the ATM that it will not subsequently modify RS1. Transaction B begins with resource set RS2 (21) in a consistent state at consistency point CP2 (22) and transitions it to consistency point CP4 (23). At CP4 Transaction B notifies the ATM that it requires resource set RS1 to continue. The ATM transfers (24) both control and the state of resource set RS1 at CP3 (20) from Transaction A to Transaction B at consistency point CP4 (23). If no errors occur subsequently, Transaction A continues, modifying resource set RS0, transitioning its state from consistency point CP3 (20) to consistency point CP5 (25) and commits. Likewise, Transaction B continues in the absence of subsequent errors, modifying resource sets RS1 and RS0, transitioning from consistency point CP4 (23) to consistency point CP6 (26) and commits.

If an undesirable condition E1 (27) occurs in Transaction A subsequent to consistency point CP3 (20) and prior to commit, and after Transaction B has committed or is in-flight, the ATM simply restores (28) resource set RS0 to consistency point CP1 (19). If Transaction B has aborted, the ATM also restores resource set RS1 (18) to consistency point CP1 (19). (It is also possible to restore to consistency point CP3 (20) and re-run the work that affects only RS0, although this is not shown in the diagram.)

If an undesirable condition E2 (30) occurs in Transaction B subsequent to consistency point CP4 (23) and prior to commit, and Transaction A has committed or is in-flight, the ATM restores (31) resource set RS2 to consistency point CP2 (22) and restores (32) resource set RS1 to consistency point CP4 (23). If Transaction A is in-flight, the ATM also transfers (33) control of resource set RS1 (80) to the Transaction A context (18). If Transaction A has aborted, it further restores resource set RS1 (18) to consistency point CP1 (19). (Again, it is also possible to restore to consistency point CP4 (23) and rerun the work that affects both resource sets RS1 (80) and RS2 (21), without handing control over RS1 (80) back to the Transaction A context, although this is not shown in the diagram.)

FIG. 4: The ATM analyzes and rewrites Transaction D from the Initial Definition (on the left hand side) to the re-structured Enhanced Definition (on the right hand side). Directives are inserted regarding favoring (34) (35) and (36), to assert consistency points (37)(38), and to deallocate resources (39)(40)(41). The “Read Z” step is performed earlier (42), thereby optimizing efficiency. The “Write Y=Y+ΔX” step is also performed earlier (43), thereby enabling both interim assertion of consistency points (37)(38) and the early deallocation, after its last use in the transaction, of each resource (39)(40)(41).
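One part of the rewrite described for FIG. 4, deallocating each resource after its last use, can be sketched as a single pass over the step list. The step list below is invented for illustration and is not the content of FIG. 4; the function name `add_deallocation_directives` and the tuple representation of steps are likewise assumptions. Reordering of steps and insertion of favoring or consistency-point directives are not modeled.

```python
def add_deallocation_directives(steps):
    """Return a copy of `steps` with a ("deallocate", resource) directive
    inserted immediately after the last step that touches each resource.

    `steps` is a list of (operation, resource) tuples in execution order.
    """
    # Lookahead pass: record the index of the final use of every resource.
    last_use = {}
    for index, (_, resource) in enumerate(steps):
        last_use[resource] = index

    # Rewrite pass: copy each step and release the resource once it is done with.
    rewritten = []
    for index, (operation, resource) in enumerate(steps):
        rewritten.append((operation, resource))
        if last_use[resource] == index:
            rewritten.append(("deallocate", resource))
    return rewritten


# Hypothetical transaction definition, known in full before execution begins.
example_steps = [
    ("read", "X"),
    ("read", "Z"),
    ("write", "X"),
    ("read", "Y"),
    ("write", "Y"),
]
for directive in add_deallocation_directives(example_steps):
    print(directive)
```

In this example Z is released right after its only read and X right after its write, rather than both being held until the end of the transaction. The sketch relies on the whole transaction definition being available in advance, which is the premise of lookahead.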

FIG. 5: This shows the scheduling of four concurrent transactions E, F, G, and H. The ATM determines from dependency information that Transaction E consists of consistency groups CG1 (44), CG2 (45), CG3 (46), and CG4 (47), that Transaction F consists of consistency groups CG5 (48) and CG6 (49), that Transaction G consists of consistency groups CG7 (50), CG8 (51), and CG9 (52), and that Transaction H consists of a single consistency group CG10 (53). It further determines that CG6 (49) shares a dependency with consistency groups CG1 (44), CG3 (46), and CG4 (47), that CG9 (52) shares a dependency with consistency group CG1 (44), and that there are no other dependencies among the transactions. Transaction H is not in the same conflict class as E, F, or G. Given this information, the ATM begins Transactions E, F, and H at time t0 (54), scheduling consistency groups CG1 (44), CG5 (48), and CG10 (53) for immediate and concurrent execution. At time t1 (55) after consistency group CG1 (44) completes, it schedules consistency groups CG2 (45), CG3 (46), CG4 (47), and CG7 (50) to run concurrently. At time t2 (56) after consistency groups CG2 (45), CG3 (46), and CG4 (47) have completed, Transaction E commits. After consistency group CG7 (50) of Transaction G completes at time t3 (57), consistency group CG8 (51) is scheduled to run. Also at time t3 (57) after Transaction E has committed, consistency group CG6 (49) of Transaction F is scheduled to run; and then at time t4 (58) the ATM schedules consistency group CG9 (52) to run. (If Transaction E has already committed, the ATM can schedule consistency groups CG8 (51) and CG9 (52) of Transaction G to run concurrently, although this is not shown in the diagram.) Because Transaction H cannot possibly be in conflict with Transactions E, F, and G, it is permitted to run to completion without further scheduling and without isolation otherwise enforced. At some time t5 (59) all the transactions will have completed and committed.
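The ordering described for FIG. 5 can be approximated by grouping consistency groups into "waves" that may run concurrently once everything they wait for has finished. The sketch below is an assumption-laden simplification: the `waits_for` table is one reading of the paragraph above (for instance, CG7 is assumed to wait for CG1 because the text schedules it only after CG1 completes), the function name `schedule_waves` is hypothetical, and waves ignore actual run times and commit events.

```python
def schedule_waves(waits_for):
    """Group consistency groups into waves of concurrently runnable work.

    `waits_for` maps each group name to the set of groups that must complete
    before it may start. Returns a list of waves, each a sorted list of names.
    """
    remaining = {group: set(deps) for group, deps in waits_for.items()}
    done = set()
    waves = []
    while remaining:
        # A group is ready when everything it waits for has already completed.
        ready = sorted(g for g, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic dependency among consistency groups")
        waves.append(ready)
        done.update(ready)
        for group in ready:
            del remaining[group]
    return waves


# Assumed waits-for edges, inferred from the description of FIG. 5.
fig5_waits_for = {
    "CG1": set(), "CG2": {"CG1"}, "CG3": {"CG1"}, "CG4": {"CG1"},   # Transaction E
    "CG5": set(), "CG6": {"CG5", "CG1", "CG3", "CG4"},              # Transaction F
    "CG7": {"CG1"}, "CG8": {"CG7"}, "CG9": {"CG8", "CG1"},          # Transaction G
    "CG10": set(),                                                   # Transaction H
}
for wave_number, wave in enumerate(schedule_waves(fig5_waits_for)):
    print(wave_number, wave)
```

Under these assumptions the four waves line up with the groups started at t0, t1, t3, and t4 in the description. A real scheduler would also account for commit timing and for unplanned transactions arriving dynamically, which this static pass does not attempt.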

FIG. 6: The ATM, in the preferred embodiment, contains all of the subunits referenced in this diagram. Due to the complexity of potential interconnectivity, which may be dynamically rearranged, it is infeasible to display all possible interconnections and hierarchies.

The Parser (60) has responsibility for interpreting or compiling transaction definitions, which it may receive from an external source or by reference to a transaction definition stored in the Repository (71) via the Repository Manager (61). The Parser may forward interpreted or compiled transaction definitions to the Repository Manager (61) for deferred execution or to the Execution Manager (62) for immediate execution. The Execution Manager (62) processes transactions, allocating and deallocating transaction contexts, passing directives and instructions to the appropriate ATM components, and orchestrating transaction scheduling, commit, rollback, and rollforward. The Consistency Manager (63) has responsibility for automatic identification of consistency points and verification of asserted consistency points. The Correction Processor (64) has responsibility for correlating abnormal conditions and consistency points, either by direct association, or through condition categories or consistency classes. Based on the transaction definition and possibly a business process definition, it may use various techniques to discover, optimally select, or create a corrective transaction and submit it to the Execution Manager (62). The Dependency Manager (65) has responsibility for interpreting dependency directives, detecting dependencies, identifying consistent groups based on dependencies, and asserting the corresponding consistency points. The Restructuring Processor (66) has responsibility for altering the order of transaction steps based on information from the Repository (71), the Consistency Manager (63), and the Dependency Manager (65). The Repository (71) is also responsible for including internally derived resource management and consistency directives in the transaction definition.
The Resource Manager (67) is responsible for accessing and updating resources, allocation management, scheduling, resource isolation, maintaining cache, and other resource constraints. The Resource Manager (67) is also responsible for detecting resource requirements, implementing resource management directives, and providing resource management directives to the Restructuring Processor (66). The Repository Manager (61) is responsible for coordinating all stored information, including dependencies, transaction definitions, associations, condition classes, consistency categories, subscriptions, and so on. The Publication/Subscription Manager (68) is responsible for processing publication and subscription definitions, detecting publication events, and notifying appropriate subscribers of publication events. The Recovery Manager (70) is responsible for evaluating, selecting, and directing recovery options, passing control to the Correction Processor (64) if a corrective transaction is selected. The Isolation Manager (69) interacts with the Resource Manager (67) and more intensively the Resource Scheduler (72) to ensure the Isolation Property for every resource and transaction is correctly maintained, sending constraints and dependency information as needed to the Publication/Subscription Manager (68) and the Dependency Manager (65).

DETAILED DESCRIPTION OF THE INVENTION

Businesses work in an imperfect world, and attempt to impose their own order on events. Constantly in a state of flux, they persist in imposing ‘acceptable’ states through the efforts of all their employees, from the accountants running yearly, quarterly, weekly, or even daily accounts, to the zealous (or indifferent) stock clerks managing inventory.

When an error occurs, it is recognized because the result differs from what is expected. Results can differ from expectations in several ways, including computational results, resources consumed, catastrophic failures to complete the work, excessive time to complete the work, and so on. Typically, the business does not know either the explicit cause of an error or its full impact. For example, it may not know if data was corrupted (wrong account number), the procedure mistakenly performed (9*6=42), or the wrong procedure used (multiplied instead of divided). Obviously, errors (including those of timeliness and resource overuse) must be prevented to the degree possible. Any undesirable effects of errors must be repaired and the desired effects asserted (correction, traditionally by resubmitting the corrected transaction). Furthermore, finding out which error occurred, and enabling those errors to be tracked, becomes over time more valuable than merely repairing and correcting each as it occurs. In this way the business can discover where it needs to focus attention on improving the overall process and improving its efficiency.

Overview of the Invention

The present invention is a method, consisting of a coordinated set of sub-methods, which enables efficient transaction processing and error management. By contrast with prior approaches, it is extensible to complex transactions and distributed business environments, and is particularly well-suited to business process management. The sub-methods are consistency points, corrective transactions, transaction relaying, lookahead-based resource management, and dependency-based concurrency optimization.

In the preferred embodiment of the present invention, a system implementing this invention (1) continually transitions between automatically-detected stable (i.e. logically correct and permissibly durable) acceptable states (each is also known as a ‘consistency point’), ensuring rapid and minimal recovery efforts for any error; (2) automatically enables inter-linked, possibly distributed, transactions to share intermediate work results at ‘consistency points’ through transaction relaying, moving from one acceptable state to the next; (3) efficiently manages I/O and storage use by identifying for each transaction (or procedure), in advance of execution, a set of data, resources, and operations depended upon by that transaction to move from one consistency point to its succeeding consistency point; (4) schedules the use of those resources in such a manner as to improve efficiency and concurrency while permitting dynamic scheduling of unplanned transactions; and (5) automatically implements repair and corrective efforts whenever a mistake is identified.

In an extension of the preferred embodiment, the system shares resources and data that are touched or handled by multiple subordinate parts of a complex or distributed transaction, rather than duplicating the same and letting each part have its own copy, or rather than locking all other parts out while each particular part operates with that same data and/or resources. This ‘overlap’ in effect becomes a window into the entire business's processes, a window that moves as transactions, or parts thereof, successfully and correctly complete or, when an error occurs, as the effects are repaired and failed work corrected. Moreover, all that needs to be maintained during the process of a particular sub-part of the transaction is the ‘delta save’, that is, the changes since the known consistency point which the chain last reached.

In yet a further extension, a system engages in transaction management by implementing transaction lookahead, or managing transaction dependencies, or any combination thereof.

Each of the sub-methods is further detailed and explicated below.

1. Consistency Points

Through the course of a transaction, it may happen that the set of resources enters a consistent state from time to time. Such a consistent state is referred to as a consistency point and may be detected automatically by the transaction manager or some other software subsystem, or may be manually asserted by the user (possibly via program code or interactive commands). Numerous methods for automatic detection of consistency exist in the literature and are well-known. Consistency points may be durable or non-durable. Durability determines the circumstances under which they may be used. In effect, a consistency point is a savepoint with the added requirement of consistency and the optional property of durability. When the system detects a potentially recoverable error, it can roll back to the consistency point by restoring the state as of the consistency point (exactly as it might to a synchronization point or were a savepoint to have been asserted). It may then optionally and automatically redo the work that was subsequently done (by, for example, reading the log or log buffers) in the hope that the error will not recur. This might be the case when, for example, (1) a deadlock is encountered (in which case the consistency point need not be durable) or (2) power fails (in which case the consistency point must be durable). Numerous methods exist for recovery to a synchronization point or savepoint, and are well-known. Rollback to a consistency point will, in general, be more efficient than rollback to the beginning of a transaction in a system which does not support consistency points.

These examples illustrate some of the value of consistency points:

-   automatic deadlock recovery: When a deadlock is detected, the usual response is to return control to the user (or program) with an error message or to select one of the participating transactions and abort it. With consistency points, the system can implement an internal retry loop which makes it very likely that the deadlock condition will not recur (for a variety of reasons). Such an internal retry loop is much more efficient than one implemented by the user (the usual approach to deadlock recovery). It is clearly more efficient than having the system automatically break deadlocks by the method of picking a “victim” of those transactions involved, and forcing it to fail, and more reliable than expecting the correct response to have been encoded into a program by a programmer.
-   automated savepoints: Savepoints are established by manual declaration of the user, either interactively or through a program, and as an added step in a transaction. By contrast, consistency points can be established by automatic detection that some particular set of one or more pre-defined consistency conditions have been met. This enables both automatic and manual rollback to the most recent consistency point.
-   categories of consistency points: Users (including business users, system designers and administrators) can define multiple sets of consistency conditions so that multiple, different categories of states, each consistent with respect to a particular set of consistency conditions, can be detected and named. Detection can be automatic and naming can be according to a pre-defined naming convention. A consistency point of category C1 is more general than a consistency point of category C2 if every consistency point of category C2 also belongs to category C1. Other rules of set theory apply and can be used to simplify testing for consistency points of one or more categories using methods well-known to one familiar with the art.
-   categorized rollback: By establishing a relationship between a type or class of error (based, for example, on error code) or other detectable condition, and a category of consistency point (possibly based on name), the system can then roll back a transaction to an associated category of consistency point when that error is detected. If the associated category of consistency point has not been detected or asserted, traditional error handling techniques can be used. Because both the relationship between error type and category of consistency point, and the consistency conditions to be detected, can be changed, the behavior of the system can be easily maintained. In one embodiment, this can be done without the necessity of modifying transaction processing programs, since the relationship and the consistency conditions can be held in a database (for example) and determined at program execution time.
-   commit processing: When a transaction commits, the standard approach is to make the final state of all affected resources durable. If a transaction contains one or more durable consistency points, the state of resources that have not been modified since a consistency point involving those resources need not be made durable during commit processing. This, in effect, permits commit processing to be spread out over time and possibly using parallel processing, thereby eliminating hotspots and speeding commit processing.
-   power failure recovery: When power fails, the usual response is to enter system recovery processing once it has been restored. The canonical approach to system restart of transaction management systems is equivalent to first initiating rollback of each transaction uncommitted at the time of power failure, and then initiating rollforward. If the rollback phase for uncommitted transactions is to the most recent consistency point, followed by notification to the user as to “where they were” according to system records, the amount of work that the system must do in order to restart, and which the user must then redo, is substantially decreased. A similar approach can be used for recovery from certain other types of failure, such as storage media failures, incorporating other standard recovery mechanisms as appropriate.
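Two of the items above, categories of consistency points and categorized rollback, lend themselves to a short sketch. The following Python is illustrative only: it assumes that each category is defined by a list of predicate functions over the state, that consistency points are recorded as (name, categories) pairs, and that the error-to-category relationship is a plain dictionary that could equally be loaded from a database. The function names and the sample conditions are hypothetical.

```python
def categories_of(state, category_conditions):
    """Return the names of all categories whose conditions `state` satisfies."""
    return {
        name
        for name, conditions in category_conditions.items()
        if all(condition(state) for condition in conditions)
    }


def rollback_target(history, error_code, error_to_category):
    """Pick the most recent consistency point in the category tied to the error.

    `history` lists (point_name, categories) pairs, oldest first. Returns None
    when no such point exists, signalling that traditional handling applies.
    """
    category = error_to_category.get(error_code)
    for point_name, categories in reversed(history):
        if category in categories:
            return point_name
    return None


# Hypothetical consistency conditions. C2 adds a condition to C1, so every
# C2 point is also a C1 point, which makes C1 the more general category.
def balanced(state):
    return state["debits"] == state["credits"]


def in_stock(state):
    return state["stock"] >= 0


category_conditions = {"C1": [balanced], "C2": [balanced, in_stock]}

# Error E1 is associated with category C1, as in the description of FIG. 2;
# the categories of CP0 and CP2 are left empty because the text does not give them.
history = [("CP0", set()), ("CP1", {"C1"}), ("CP2", set())]
error_to_category = {"E1": "C1"}

print(categories_of({"debits": 5, "credits": 5, "stock": -1}, category_conditions))
print(rollback_target(history, "E1", error_to_category))  # CP1
```

Because the mapping from error to category and the category definitions are data rather than code, they can be changed without touching the transaction programs, which is the maintainability point made in the categorized rollback item.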

Compared with all prior art, the present invention's use of consistency is far more consistent, logical, and powerful. Most present-day DBMS products (e.g., IBM's DB2 or Oracle's Oracle 9i) implement only an extremely limited concept of consistency enforcement, generally known as integrity rule or constraint enforcement. However, while these products may verify that the changes made by a transaction are consistent with some subset of the known integrity rules at various times (e.g., after each row is modified, after a specific transaction step is processed, or before transaction commit), no product currently on the market establishes and uses internally valid and logically consistent “checkpoints” (i.e. consistency points) to which the transaction can recover (perhaps automatically). Nor do they permit the user to request the establishment of consistency points, to assert consistency points (except implicitly and often erroneously at the end of a transaction), or to separate consistency points from synchronization points (as, for example, between volatile memory and durable storage). Other advantages and uses of consistency points are further detailed below as they interact with other elements of this invention.

By extension, the method of consistency points can be applied to pseudo-transactions, physical transactions, logical transactions, and business transactions.

2. Transaction Relaying

Transaction relaying refers to the method of moving the responsibility for resource isolation and consistency in a window from transaction to transaction, much like the baton in a relay race, and permitting sharing of that responsibility under certain conditions (explained below). By further analogy, and for the purpose of explaining transaction relaying in its most simplified form, two transactions A and B become like runners in a relay race (football game). The baton (football) is a resource that A must pass to B without dropping (corruption). A conflicting transaction C is like a member of the competing team that would like to acquire control of the baton (football) from A and B. By passing the baton without either runner slowing down (permitting B to gain access to the resource held by A prior to commit), there is no opportunity for the competing team to acquire control (for conflicting transaction C to gain control of, let alone alter, the resource). Furthermore, the entire process is much more efficient than if the runners were to stop in order to make the transfer.

Consider a transaction B having either a semantic or resource dependency (or both) on transaction A. For example, suppose that a particular business process consists of transactions A and B, and that there is an integrity rule or constraint, or a dependency that requires transaction B must always follow A because it relies upon the work done by A. In other words, some portion of the final state of resources affected by A (the output of A) is used as the initial state of resources required by B (the input of B). By the definitions of transaction and consistency point, the final state of A is a consistency point, even before A commits. Under the usual approaches we must either (1) accept the possibility that the final state of A is altered by some transaction C before B can access and lock the required resources (the sequential transaction scenario), (2) accept the possibility that the state of resources needed by B is different than the state of those same resources as perceived by some other transaction (chained transactions), or (3) run transactions A and B combined in a distributed transaction, accepting the fact that all resources touched by either A or B will be locked until B completes (the distributed transaction scenario).

Transaction relaying recognizes the fact that A and B may share the state of the resources that B requires at least as soon as A enters the final consistency point for those stated resources and has made that final state durable (assuming durability is required). Unlike chained transactions, it need not wait until A is ready to commit. It need not even wait until locks are released. Rather, the transaction manager, lock manager, or some other piece of relevant software either transfers ownership of those locks directly to B or establishes shared ownership with B (as long as only one transaction has ownership of exclusive locks on a resource at any given time if the ACID properties are desired), and never releases them for possible acquisition by C. Unlike the sequential transaction scenario, there is no possibility that C will interfere in the execution of B. Unlike the chained transaction scenario, transaction relaying does not require transaction A to have committed, the beginning of transaction B to be immediately after the commit of transaction A, the commit of A and begin of B to be atomically combined in a special operation (indeed, B may already have performed work on other resources), transactions A and B to be strictly sequential, or transaction B to be the only transaction that subsumes shared responsibility for resources previously operated on by transaction A. Unlike the distributed transaction scenario, resources held by A, but upon which the initial state of B does not depend, are released as soon as A completes and there is no two-phase commit overhead. Unlike split transactions, transaction relaying does not introduce artificial transaction contexts, can be fully automated without sacrificing consistency, and yet enables collaborative transaction processing in which work groups can communicate about the status and intermediate results of their work (including negative results).

An extension of the method is to permit transaction B to have done additional work on other resources prior to the consistency point discussed above. Another extension of the method is to permit A to do work on other resources after the consistency point discussed above. A further extension of the method is to permit transaction A to do work after the consistency point discussed above, so long as no consistent state on which transaction B depends is ultimately altered by transaction A.

Yet another extension of the method is to permit transactions other than transaction B to have a similar relationship to transaction A, involving possibly different resources and possibly different consistency points. The method preserves the ACID properties of all transactions as long as no more than one transaction in effect has responsibility for modification of a shared resource at any particular time, and that transaction can roll back the state of those resources to the most recent durable consistency point in which they are involved. If durability is not a recovery requirement (as, for example, during deadlock recovery), then the consistency point need not be durable.

By extension, under transaction relaying, if the initial state of a resource as needed by one or more transactions including B happens to be an intermediate state of that resource produced by A, it may be made available to those transactions long before A commits if the following conditions are true (other conditions may enable this as well): (1) at most one transaction of those sharing responsibility for recoverability, isolation, and consistency of the resources modifies those resources subsequently, (2) the intermediate state is a consistency point, and (3) the intermediate state is recoverable (though not necessarily durable). These conditions are intended to guarantee that the result of A and B with transaction relaying around a consistency point is equivalent to some serializable interleaving of transactions D, E, F, and G, where D is the work that A does before the consistency point, E is the work A does afterward, F is the work B does before the consistency point, and G is the work B does after the consistency point. Other sets of conditions or rules that would produce this result are possible.

Moreover, the intermediate state produced by A could just as easily have been produced by B (or other specific transactions) had the instructions to do so been inserted in B (or those other transactions) at some point prior to that at which the intermediate state of A is accessed by B. Transaction restructuring such as this under transaction relaying may be used to improve processing efficiency and performance. By further extension, under transaction relaying a group of transactions can share multiple intermediate states. This may become important when scheduling subordinate parts of a complex transaction for the most efficient processing; transaction relaying allows a transaction management facility to balance work amongst ‘subordinate’ transactions by including instructions such as those described in all subordinate transactions (or at least establishing the means for such inclusion when needed) and then selecting which of those subordinate transactions actually perform the work so as to promote efficiency, either in advance of execution or dynamically during execution.

In transaction relaying, both A and B share control over isolation of shared resources. For example, they would share ownership of the locks on the shared resources if locking were used to control isolation. Optimally, and in order to preserve the consistency and isolation properties, both A and B must have completed before transactions other than A and B perceive locks on those resources to have been released. If B completes before A, B relinquishes its lock ownership and A retains lock ownership until A completes. If A completes before B, A relinquishes its lock ownership and B retains lock ownership until B completes. In this way, both A and B (all owners of the shared resource) must release locks on shared resources in a manner consistent with the type of lock held (e.g., share versus exclusive locks) and the concurrency control mechanism before other transactions can access the resource. If A and B complete simultaneously, or whenever A and B have both completed, lock ownership reverts to the resource manager and so locks are effectively released. In order to preserve serializability, the two-phase locking protocol applies to the shared resource as if a single transaction were involved. The usual rules of lock promotion or demotion apply. Insofar as external transactions (that is, transactions not involved in sharing the resources in question via transaction relaying) are concerned, a resource shared by A and B is locked in the manner which is most exclusive of the types of access requested by A and B. Similar rules may apply to lock scope escalation (e.g., row to page) and to transaction relaying involving more than two transactions.
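
By way of illustration only, the lock-sharing rules just described might be sketched as follows in Python. The names Mode, RelayLock, relay, complete, and external_mode are hypothetical and do not appear in this specification. The sketch assumes a single resource, two lock modes, and one way of honoring the single-exclusive-owner rule, namely that relaying an exclusive lock demotes the prior owners to share mode.

```python
from enum import IntEnum


class Mode(IntEnum):
    SHARE = 1
    EXCLUSIVE = 2


class RelayLock:
    """Hypothetical lock on one resource whose ownership can be relayed."""

    def __init__(self):
        self.owners = {}  # transaction id -> lock mode held by that owner

    def external_mode(self):
        """Mode seen by outside transactions: the most exclusive one held."""
        return max(self.owners.values(), default=None)

    def acquire(self, txn, mode):
        """Ordinary request from a transaction outside the relay group."""
        if self.owners and Mode.EXCLUSIVE in (mode, self.external_mode()):
            return False  # conflicts with the current owners
        self.owners[txn] = mode
        return True

    def relay(self, holder, receiver, mode):
        """Share ownership with receiver; the lock is never released."""
        if holder not in self.owners:
            raise KeyError(f"{holder} does not own this lock")
        if mode is Mode.EXCLUSIVE:
            # Assumption: keep at most one exclusive owner by demoting
            # every current owner to share mode before the hand-over.
            for txn in self.owners:
                self.owners[txn] = Mode.SHARE
        self.owners[receiver] = mode

    def complete(self, txn):
        """Owner finishes; True once ownership reverts to the resource manager."""
        self.owners.pop(txn, None)
        return not self.owners


lock = RelayLock()
lock.acquire("A", Mode.EXCLUSIVE)
lock.relay("A", "B", Mode.EXCLUSIVE)             # B gains access before A commits
blocked_during = lock.acquire("C", Mode.SHARE)   # C cannot interfere
a_released = lock.complete("A")                  # B still owns the lock
blocked_after_a = lock.acquire("C", Mode.SHARE)
b_released = lock.complete("B")                  # lock effectively released
granted_at_end = lock.acquire("C", Mode.EXCLUSIVE)
```

In this sketch the conflicting transaction C is refused for as long as either A or B remains an owner, which corresponds to the requirement that all owners complete before other transactions perceive the lock as released.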

By obvious extension, transaction relaying can be used in systems that employ non-standard concurrency control schemes and enforce isolation through mechanisms other than locking; appropriate adjustment to the specific mechanism that enforces isolation is then required to permit the sharing of resources at consistency points.

By extension, transaction relaying enables a transaction management facility (or other appropriate software systems) to remove redundant operations performed by a group of transactions and assign those operations to a specific transaction or transactions, thereby improving the overall efficiency of the system. Such a facility can determine which operations among a group of transactions are redundant through automatic means well-known to those familiar with the art (for example, pattern matching), can be informed of those redundant operations by some other agent such as a human individual knowledgeable about the intent of the transactions in the group, or can use some combination of the two.

Transaction relaying can be extended to arbitrarily complex collections of concurrent and interdependent transactions, even if those transactions were running under distinct transaction managers in a distributed computing environment. In such cases, the means for isolation enforcement will typically be distributed, but two-phase commit processing is not required across those transactions involved in transaction relaying (although it need not be precluded). Numerous mechanisms for distributed isolation enforcement exist and will be well known to one familiar with the art. Indeed, once the method of transaction relaying has been explained as it applies to two transactions (“A” and “B”), extensions to arbitrarily complex collections of concurrent and interdependent transactions, including those spread across a distributed computing environment however geographically dispersed or however many business entities may be involved, will be obvious to one trained, competent, and versed in the art.

By extension, this method of the present invention can be implemented so that transactions publish their states and/or consistency conditions at consistency points and permit other transactions to subscribe to the state of associated resources. A variety of methods may be used to determine which of the subscribing transactions will gain write permission over the associated resources and in what order. By further extension, the group of subscribing transactions can be treated to various methods of concurrency optimization, including the method of dependency based concurrency optimization described below. By extension, the method of consistency points can be applied to pseudo-transactions, physical transactions, logical transactions, and business transactions.

In another extension of the present invention, a locking flag is used to denote the dependency upon each particular resource (including data elements), and to transfer control over and responsibility for such to the transaction which has yet to attain a consistent state with the same, thereby allowing intermediate, partial, or distributed transactions to process and reach completion or acceptable states without necessitating the entirety of a complex or distributed transaction to successfully conclude.

3. Corrective Transactions

Corrective transactions provide an alternative to both compensation and rollback in circumstances in which the desired result of a transaction can be understood as producing a state that meets a particular set of consistency conditions. For example, an ATM transfer transaction may have as its key consistency conditions the crediting of a specific account by a specific amount of money, and maintaining a balance of debits and credits across a set of accounts (including the specified one).

In the event that an error occurs during transaction processing, a corrective transaction appropriate to the error is invoked. Rather than restoring the initial state of a set of resources as would either a rollback or a compensating transaction, a corrective transaction transforms or transitions the state of the affected set of resources to a final state which satisfies an alternative set of consistency conditions (integrity constraints and transition constraints). The alternative set of consistency conditions constrains the final state to one of possibly many acceptable states and may be, for example, completely distinct from the initial set or may be a more general category of consistency conditions. For example, consider a simple business process consisting of two predefined but parameterized transactions, a funds-transfer transaction (parameterized for transfer amount and two account numbers) and a loan transaction (parameterized for loan amount but with fixed account number). If an attempt to transfer a specified amount between two accounts fails because of insufficient funds, an automatic corrective transaction might loan the user the required funds, thereby expanding the consistency conditions to include an account not owned by the user with respect to balancing credits and debits. In this example, the corrective transaction might be manually predefined by the bank and caused to run as part of an error handling routine. Similarly, rather than debiting the explicitly specified account (for example, checking), it might debit an alternate account (for example, savings or an investment account).
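
By way of illustration only, the funds-transfer example might be sketched as follows in Python. The account names, the bank_loans account, and the functions transfer, loan, and transfer_with_correction are hypothetical, and the sketch assumes that accounts are held in a simple in-memory mapping rather than under a real transaction manager.

```python
class InsufficientFunds(Exception):
    """Raised when the source account cannot cover the transfer."""


def transfer(accounts, src, dst, amount):
    """Funds-transfer transaction, parameterized by amount and two accounts."""
    if accounts[src] < amount:
        raise InsufficientFunds(src)
    accounts[src] -= amount
    accounts[dst] += amount


def loan(accounts, borrower, amount, lender="bank_loans"):
    """Loan transaction: the lending account is fixed, the amount is not."""
    accounts[lender] -= amount
    accounts[borrower] += amount


def transfer_with_correction(accounts, src, dst, amount):
    """Run the transfer; on insufficient funds, run a corrective transaction.

    The corrective path does not restore the initial state. It moves to an
    alternative acceptable state in which debits and credits still balance,
    but across a wider set of accounts that includes the lender.
    """
    try:
        transfer(accounts, src, dst, amount)
        return "transferred"
    except InsufficientFunds:
        shortfall = amount - accounts[src]
        loan(accounts, src, shortfall)
        transfer(accounts, src, dst, amount)
        return "transferred after loan"


accounts = {"checking": 40, "payee": 0, "bank_loans": 1000}
total_before = sum(accounts.values())
outcome = transfer_with_correction(accounts, "checking", "payee", 100)
total_after = sum(accounts.values())
```

Here the broadened consistency condition is that the sum over all three accounts is unchanged, whereas the original condition concerned only the two accounts named in the transfer.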

This method of the present invention replaces the usual fixed set of consistency conditions with a category of such sets and invokes an auxiliary set of actions (the corrective transaction) that will transform the current state into one that satisfies some set of consistency conditions belonging to that category. That is, the traditional concept of the consistency property for transactions is refined such that the options for achieving a consistent state in the completion of a transaction are broadened. For each set of consistency conditions defining the end state of a transaction, each of the other sets of consistency conditions belonging to its category constitutes an acceptable set of consistency conditions. This concept of acceptable sets of consistency conditions mimics the real world of business, in which errors are common and a strictly pre-determined result of work is not possible. Rather, those who perform work in a business context strive to achieve some acceptable result, where acceptability is determined by satisfaction of a number of alternative sets of constraining conditions and is often associated with business risk and opportunity assessment.

This method is particularly valuable when a set of linked interdependent transactions is involved and a flat transaction model does not apply. For example, a classic problem of this nature involves the scheduling and booking of a travel itinerary. It is not uncommon that the ideal routing, carrier, and timing are unavailable for every segment of a multi-segment itinerary, but that some compromise alternative is available. Each segment is often reserved and booked via a separate transaction, and cancellation penalties after more than a few minutes may preclude arbitrary rescheduling. Possible compromises constitute alternative consistency conditions, possibly ranked by the traveler's preference. If a transaction to book a particular segment of the itinerary fails, a corrective transaction can book an alternative for that segment. For example, it might involve booking a flight to an airport near the original segment destination and a rental car with the attendant compromise of less time between flights. Similarly, a corrective transaction might cancel a certain number of already scheduled segments in order to assert a more viable alternative schedule. The segments to be cancelled might be selected, for example, based on minimizing any negative financial impact on the overall cost of the itinerary.

Business processes do not always lend themselves to such simple models as those assumed by existing approaches to transaction processing: often they involve interleaved multi-hierarchies and networks. The processes a business uses to correct for errors do not always return the business to a prior state as is assumed in other approaches to transaction error handling (it would be too costly to do so). Rather, the business is transitioned to some acceptable state and the nature of this state made available to those portions of the business that have some dependence upon it. Notice the repeated reference to “some acceptable state” instead of the more familiar technical notion of a specific internally consistent transaction end state. Obviously, businesses do not follow a rigid set of rules of consistency as a database might. However, it should be equally obvious that some action will be taken if the business is not in an acceptable state. Rather than ignoring this approach, depending entirely on manual corrections (difficult if not impossible at today's transaction volumes), or insisting that the map must be the territory, the present invention actively attacks the problem by defining consistent and acceptable states to which the business process will move when it becomes flawed, states from which it may resume normal transaction management once again.

In a business process, the various constituent and linked transactions (including pseudo-transactions) often create a complex network of steps with many decision branches and concurrent sub-processes. Many portions of the process are designed to handle exception or error conditions. If a transaction fails, then rollback and redo, or rollback of a transaction that includes a decision branch, may not be a reasonable option. In particular, such a recovery mechanism will often consume so much time or other resources that the business process is no longer viable. The method of corrective transactions requires that one identify a state that would have been reachable had a different portion of the process been activated (that is, a different branch had been taken), and that satisfies an acceptable set of consistency conditions. Each such state is designated as an alternative end state. The failed transaction is then rolled back to the most recent state for which a transaction or set of linked transactions (the corrective transaction) exists that will transition from that state to an alternative end state. This point may be the current error state (and possibly inconsistent), or it may be the most recent consistency point. The corrective transaction is then run.

The method of corrective transactions requires that each business, logical, or physical transaction submitted to the system, and which is to be subject to the benefits of the method, be identified according to the consistency conditions that will be enforced on the set of resources affected by that transaction or that such consistency conditions be automatically discoverable by the system. Such consistency conditions might, for example, be stored conveniently in an online repository so as to be accessible to the transaction manager, other appropriate software, or a human individual. Whenever an error occurs that results in the failure of the transaction (thereby failing to establish a state among the preferred final states), the failed transaction is returned to a recoverable consistency point (the most recent one in the preferred embodiment). The error is classified (in the preferred embodiment according to the nature of the most recent consistency point) and the corresponding set of consistency conditions on the affected resources is established. A transaction (the corrective transaction) is then invoked which will transform the affected resources from the state of the most recent consistency point to a state that most closely approximates the intended state and satisfies the new consistency conditions (we refer to these as “acceptable conditions”), assuming that such a transaction exists. In the event that no such corrective transaction exists, the failed transaction is then returned to an even earlier consistency point, and an appropriate corrective transaction invoked. The process is repeated until an acceptable set of consistency conditions is reached. By extension, this iterative process might be replaced by other techniques which achieve an equivalent result, examples of which are described below.
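
By way of illustration only, the iterative procedure just described might be sketched as follows in Python. The function recover, the catalog keyed by error class and consistency point name, and the itinerary data are all hypothetical. The sketch assumes that recoverable consistency points are recorded oldest first and that a corrective transaction can be represented as a function from a state to a new state.

```python
def recover(consistency_points, error_class, catalog):
    """Return (point name, corrected state) for a failed transaction.

    consistency_points: list of (name, state) pairs, oldest first.
    catalog: maps (error_class, point name) to a corrective transaction,
             represented here as a function from a state to a new state.
    """
    # Try the most recent consistency point first, then ever earlier ones.
    for name, state in reversed(consistency_points):
        corrective = catalog.get((error_class, name))
        if corrective is not None:
            # Work on a copy so the recorded consistency point is preserved.
            return name, corrective(dict(state))
    raise LookupError(f"no corrective transaction for {error_class!r}")


def book_nearby_airport(state):
    """Hypothetical corrective transaction for a failed flight segment."""
    state["segment_2"] = "flight to nearby airport + rental car"
    return state


points = [
    ("itinerary_opened", {"segment_1": None, "segment_2": None}),
    ("segment_1_booked", {"segment_1": "flight", "segment_2": None}),
    ("seat_selected", {"segment_1": "flight, seat 12A", "segment_2": None}),
]
catalog = {("segment_unavailable", "segment_1_booked"): book_nearby_airport}

point_used, corrected = recover(points, "segment_unavailable", catalog)
```

In this sketch no corrective transaction is registered for the most recent point, so the procedure falls back to the earlier one, in the manner of the repeated return to earlier consistency points described above.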

In one embodiment, the establishment of a target set of acceptable conditions is determined automatically, for example by means as diverse as rule-based inference based on error class, the use of a theorem prover to determine conditions which will permit the transaction to complete, or a catalog lookup. In another embodiment, the establishment of acceptable conditions (or equivalently a transaction that will produce those conditions) is determined by an interaction with a suitably authorized person. One familiar with the art could easily specify numerous other means to determine the acceptable conditions based on a combination of class of error, recoverable consistency points within the failed transaction, and consistent states accessible by executing one or more transactions.

In one embodiment, the steps in the corrective transaction (that is, its definition) are fixed in advance and there is one such transaction for each class of error. In another embodiment, the steps which constitute the corrective transaction (which themselves might be either implicit or explicit transactions) are determined automatically using, for example, a theorem-prover which reasons from the consistency point (initial state as axioms) to a final state which meets the acceptable conditions, the steps of the proof being the steps in the corrective transaction. In an alternative embodiment, back chaining is used to start from an arbitrary, potential state that meets the acceptable conditions and is defined, for example, as part of an overall business process schema, incorporating steps from a pool of pre-defined steps, operations, or transactions until the state given as the consistency point is reached. The incorporated steps in reverse order of discovery then define the steps of the corrective transaction. In such an embodiment, both the failed transaction and the corrective transaction might be business transactions consisting of ordered activities or transactions, thus each being portions of a business process, possibly involving human interaction to accomplish business activities.
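
By way of illustration only, the back chaining embodiment might be sketched as follows in Python. The function back_chain and the step and state names are hypothetical. The sketch assumes that each pre-defined step can be described as a transition from one named state to another, and it uses a breadth-first backward search, which is only one of many possible search strategies.

```python
from collections import deque


def back_chain(steps, consistency_point, goal):
    """Derive the steps of a corrective transaction by backward search.

    steps: iterable of (name, state_before, state_after) from a pool of
           pre-defined steps.
    Returns the step names in execution order, or None if the goal cannot
    be connected back to the consistency point.
    """
    producers = {}
    for name, before, after in steps:
        producers.setdefault(after, []).append((name, before))

    # Each queue entry: (state still to be explained, steps discovered so far).
    queue = deque([(goal, [])])
    seen = {goal}
    while queue:
        state, discovered = queue.popleft()
        if state == consistency_point:
            # Reverse order of discovery gives the order of execution.
            return discovered[::-1]
        for name, before in producers.get(state, []):
            if before not in seen:
                seen.add(before)
                queue.append((before, discovered + [name]))
    return None


pool = [
    ("book_flight_to_nearby_airport", "segment_failed", "at_nearby_airport"),
    ("rent_car", "at_nearby_airport", "at_destination"),
    ("cancel_trip", "segment_failed", "trip_cancelled"),
]
plan = back_chain(pool, "segment_failed", "at_destination")
```

The search starts from the acceptable state and works toward the consistency point, so the step found first (rent_car) is executed last.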

In another embodiment the selection of acceptable conditions, acceptable state, and sequence of steps that constitute the corrective transaction may be optimized using one or more of a variety of optimization techniques (these will be well-known to those familiar with the art) to meet given optimization goals. For example, the optimization goals might optimize for minimum resource usage, shortest execution time, least human interaction required, and the like. Similarly, the members of the set of acceptable conditions may be prioritized or ordered based on some arbitrary optimization criteria, and subsequently selected as needed through automated or manual means.

It is well within the means of the average professional skilled in the relevant arts to extend the concept of a corrective transaction to more complex scenarios involving multiple transactions for which some group behavior is desired. A common example occurs in practice in the context of process management and workflow. By a process we mean a collection of interdependent transactions (including possibly business transactions, logical transactions, and pseudo-transactions) that transform the state of a set of resources in a well-defined though not necessarily strictly deterministic manner, that manner being identified by a collection of transition rules (integrity constraints) which specify the permissible (partial) orderings of those transactions in time. Certain connected subsets of these transactions may themselves have atomic properties though not all of the ACID transaction properties, and so are considered pseudo-transactions. In some embodiments of a process, some or all of the transactions constituting the process may not be true transactions in the strict sense of the word and may be referred to as tasks, activities, business functions, and the like. (Indeed, the individual operations of any type of transaction can be considered to be a process.)

For example, it may be difficult in practice to enforce the isolation property across these transactions: thus, the result of some transaction deep in the dependent chain (or tree or net) may influence the outcome of some transaction that is not one of those in the atomic group. For practical reasons (performance, lack of control, etc.), we may not be able to use distributed transactions or compensation. Both distributed transactions and compensation may furthermore be undesirable simply because they return the process to an initial state for the atomic group of transactions rather than moving it forward to an acceptable state and meeting acceptable conditions.

The method of corrective transactions permits analysis of a process schema of which a failed transaction is a part, the supplementing of the process as necessary with interactive input, and determination of a partially ordered set of transactions or actions (this set constituting the corrective transaction) that will transition from the current state to a state that is approximately (in terms of consistency goals) the same as would have been achieved had all gone well. How closely the corrected state approximates the one that would have resulted is entirely under the control of the system designer, constrained only by limitations imposed by the intended application or the real world.

A process often contains multiple alternate paths specifying the work to be done and leading to various states or conditions satisfying various consistency conditions, the alternate paths being selected either singularly or severally at a branch point in the process. Thus, from a branch point it may be possible to achieve a certain amount of work and an associated acceptable state in multiple ways, some more “consistent” or more ideal than others. It may even be possible to achieve exactly the ideal acceptable state by an alternate path. Such an alternative path constitutes the corrective transaction. It may involve using different resources, require doing some work that would not otherwise have been done, require leaving some otherwise desirable work undone, require supplementing the process with interactive input, and so on.

In further extension to the preferred embodiment of this submethod of the present invention, a cost-benefit approach (similar to that sometimes applied to compensatory transactions) is used. Traditional compensating transactions are used when the combined cost of undo followed by redo is relatively small and has minimal impact on the rest of the system, when there are no context-dependent side-effects involved, when there are commutable transactions at every stage, or when an undo followed by redo is unlikely to cause errors in some other portion of the system (given the resource cost and especially in terms of time delays). Otherwise, a corrective transaction is used to transition directly to an acceptable state which then need not be the original target state.

In a further extension of the preferred embodiment of this submethod of the present invention, this method permits manual input to define and apply the corrective transaction to the current state to reach the desired acceptable state.

In a further extension of the preferred embodiment of this submethod of the present invention, this method uses previously-determined, policy-driven programming implementing pre-set rules of the business to derive, from the difference between the desired acceptable state and the current but incorrect state, the nature of the corrective transaction, and then automatically applies the corrective transaction to the current state to reach the desired acceptable state.

In a further extension of the preferred embodiment of this submethod of the present invention, this method uses methods such as goal-oriented programming or genetic algorithms to derive, from the difference between the desired acceptable state and the current but incorrect state, the nature of the corrective transaction, and then automatically applies the corrective transaction to the current state to reach the desired acceptable state.

In one alternative extension of the above further extension to the preferred embodiment of this submethod of the present invention, this method uses backward-propagating logic (‘back propagation’) to derive, from the difference between the desired acceptable state and the current but incorrect state, the nature of the corrective transaction, and then automatically applies the corrective transaction to the current state to reach the desired acceptable state.

In an alternative extension of the last-named extension of the present invention, the method uses matrix, linear, or other algebraic algorithms to calculate the least-cost, highest-benefit corrective transaction that will take the current state to the desired acceptable state, and then automatically applies the corrective transaction to the current state to reach the desired acceptable state.

In another alternative extension of the present invention, the method uses single-element redefinition algorithms to calculate the least-cost, highest-benefit corrective transaction that will take the current state to the desired acceptable state, and then automatically applies the corrective transaction to the current state to reach the desired acceptable state.

In another alternative extension of the present invention, the method uses any of the above-named techniques to calculate the corrective transaction to be applied to the current state, but only attempts to satisfy the minimally-acceptable set of conditions when attempting to derive the corrective transaction.

In another alternative extension of the present invention, the method uses any of the above-named techniques to calculate which corrective transaction will reach the closest possible alternative end state to the minimally acceptable consistent state, applies the corrective transaction, and then reports the remaining difference for manual implementation of the final step to reach said minimally acceptable consistent state.

By extension, the method of consistency points can be applied to pseudo-transactions, physical transactions, logical transactions, and business transactions.

4. Lookahead-Based Resource Management

Existing resource management methods do not take into account available information about either the operations and resources involved in a transaction, or the transactions (and therefore the resources) involved in a business process. Thus, for example, if a first step involves a request to read a data resource and a subsequent step involves a request to modify that same data resource, the probability of that data resource being found in cache is not influenced by any determination that the subsequent step will or will not require that data resource. Some DBMS products attempt to keep all data resources, once accessed, in cache (or some other high speed storage). Various algorithms may be used for determining when cache, or some portion thereof, can be overwritten (for example, a least recently used algorithm and its many variants). Other DBMS products may influence the probability that certain data resources will be kept in cache for a longer time based on statistical patterns of access. For example, certain types of requests involve sequential reading of large amounts of data resources and it makes sense to “pre-fetch” the next group of data in the expectation that the sequential reading will continue. As another example, certain types of cursor activity in a relational DBMS strongly suggest that the data resource initially read will be subsequently updated, as with SQL requests of the form OPEN CURSOR . . . FOR UPDATE . . . . None of these methods has the advantages of pre-determining the need for resources.

Lookahead-based resource management is a submethod of the present invention that enables optimized automation and execution of a transaction or group of transactions, particularly feasible and appropriate for complex transactions as defined above. This is accomplished by making some or all resources (such as data or other resources) that will subsequently be used in processing a transaction or group of transactions explicitly known to the software responsible for processing said transaction or group of transactions in advance of the need to execute said transaction or group of transactions. The optimized management of those resources needed to process the transaction or group of transactions, and possibly other resources, is enabled by means to inform the software responsible for processing and/or optimization (the ‘Transaction Process’) of said resources either by directive or by inference in association with the definition of the request for said processing. This is done by making the definition of one or more steps in a transaction (or group of transactions) known by one of several means to the Transaction Process in advance of the request to process said step or steps. From such an advance definition, the Transaction Process can infer the resources necessary to perform said step or steps. Alternatively, and as a means of further efficiency, the originator of the request definition (whether a human, program, or machine) can incorporate the identification of the resources directly in the definition. As a means of yet further efficiency, the originator can include within the request definition directives that instruct the Transaction Process as to how to optimally manage resources in anticipation of steps of a transaction or group of transactions.

In the preferred embodiment of this submethod, the entire transaction definition is made known to the Transaction Process in advance of the initial request to begin processing that transaction (possibly by name or some other transaction identifier). The definer of the transaction identifies at transaction definition-time the data resources that should be highly favored for cache retention, at what step to begin such favoring, and at what step to remove or reduce that favoring. As a further efficiency, these identifications may be aided through automated techniques such as monitoring the use of resources while the transaction is being run, thereby identifying those resources and determining at which points particular resources are no longer required. In this embodiment, these resources are accessed once and then maintained in cache until the last step that needs said resources. In the event that there is insufficient cache, other secondary methods of cache management may then be used. As a further efficiency, resources are acquired and released at consistency points, thereby reducing the likelihood that an error or rollback condition will force resources to be released. Thus, as a specific example, a transaction containing a step to read some data followed, perhaps with intervening steps, by a step to modify that same data might be predefined as a stored procedure (for example) and invoked by name. Following the transaction definition, the Transaction Process marks the data read (as a consequence of the first step) to be highly favored for retention in cache until the second step completes. The cache management algorithms used by the Transaction Process (well known to those familiar with the art) are augmented to give cache preference to data so marked in an obvious manner.

In another embodiment, the Transaction Process identifies the resources needed by each step of the transaction automatically, and further identifies which resources will be needed multiple times, and at what point those resources may be released.
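The cache-retention marking described for the preferred embodiment can be sketched roughly as follows. This is only an illustrative sketch, not the disclosed implementation: the class name `LookaheadCache`, its methods, and the use of a least recently used policy as the secondary method are all assumptions introduced here. Data declared in advance as needed through a given step is skipped by eviction until that step completes.

```python
from collections import OrderedDict


class LookaheadCache:
    """LRU cache whose eviction skips keys that a transaction definition
    has declared as still needed (hypothetical illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # key -> value, least recently used first
        self.pinned_until = {}         # key -> last step that needs the key

    def declare(self, key, last_step):
        # Known in advance from the transaction definition.
        self.pinned_until[key] = last_step

    def step_completed(self, step):
        # Remove the favoring once the last declared use has finished.
        for key in [k for k, last in self.pinned_until.items() if last <= step]:
            del self.pinned_until[key]

    def get(self, key, load):
        if key in self.entries:
            self.entries.move_to_end(key)      # cache hit: refresh recency
            return self.entries[key]
        value = load(key)                      # cache miss: fetch the resource
        self.entries[key] = value
        self._evict()
        return value

    def _evict(self):
        while len(self.entries) > self.capacity:
            # Prefer the least recently used entry that is not favored.
            victim = next((k for k in self.entries if k not in self.pinned_until), None)
            if victim is None:
                # Insufficient cache: fall back to plain LRU as the secondary method.
                victim = next(iter(self.entries))
            del self.entries[victim]
```

In this sketch a read in step 1 followed by an update in step 3 would be declared with `declare(key, last_step=3)`, so intervening reads of other data cannot push the favored data out of cache.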

In another embodiment, the Transaction Process further optimizes processing by pre-allocating cache, storage space, locks, or other resources based on advance knowledge of one or more of the steps in the transaction. In another embodiment, the Transaction Process may alter the order of execution of the steps in such a manner that the intended meaning of the transaction is not altered, but resource management and possibly performance are optimized, as for example, pre-reading all data in such a manner as to reduce disk I/O, to improve concurrency, or to improve parallel processing. Other similar and numerous optimizations that become possible when one or more of the steps of a transaction are known in advance of the need to process those steps will be readily apparent to one familiar with the art.

In another embodiment, the definitions of a group of transactions necessary to process a particular application are stored in a repository. When a request is made to run the application, the Transaction Process looks up the definition of the transactions pertaining to said application, including all the steps in each transaction. The Transaction Process then determines the resources necessary to perform each step, determining at which step said resources must be first acquired, at which step they will last be used, and at which step they can be first released. (In an alternative embodiment, the repository also contains identification of all resources necessary to perform those steps, said resources having been previously identified either by software or human means. In yet another alternative embodiment, the repository also contains the relative time of said first acquisition, final use, and first possible release of each required resource.) The Transaction Process then applies any of numerous optimization methods well-known or accessible to one familiar with the art to optimize management of resources in its environment including, for example, data caching, lock management, concurrency, parallelism, and the like.
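As an illustration of the determination described above, the following minimal sketch derives, for each resource, the step at which it is first acquired, the step at which it is last used, and the first step at which it can be released. The function name and the representation of a transaction as an ordered list of `(step_name, resources)` pairs are assumptions made for this example only.

```python
def resource_lifetimes(steps):
    """steps: list of (step_name, set_of_resources) in execution order.

    Returns {resource: (first_use, last_use, first_release)} as step indexes,
    where first_release is the index of the step following the last use.
    """
    first, last = {}, {}
    for index, (_name, resources) in enumerate(steps):
        for resource in resources:
            first.setdefault(resource, index)   # remember only the earliest use
            last[resource] = index              # overwritten until the final use
    return {r: (first[r], last[r], last[r] + 1) for r in first}


# Hypothetical three-step transaction definition.
definition = [
    ("read balance", {"account"}),
    ("write audit record", {"audit_log"}),
    ("update balance", {"account", "ledger"}),
]
lifetimes = resource_lifetimes(definition)
```

With lifetimes of this kind available before execution begins, a resource such as `account` could be retained from step 0 through step 2, while `audit_log` could be released as soon as step 1 completes.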

5 Dependency-Based Concurrency Optimization

The method of dependency-based concurrency optimization enables a scheduling facility to restructure the steps or operations in a collection of one or more transactions so as to optimize concurrency and efficiency. By restructuring we mean changing either the order or the context of execution of transactions, steps, or groups of steps so as to be different from that order or context in which those transactions, steps, or groups of steps were submitted. The purpose of this method of “static scheduling” is to determine which transactions can absolutely be run together without interference, not which ones cannot. If there is doubt, traditional dynamic scheduling can be used. Dependency-based concurrency optimization is an improvement upon traditional transaction classes and traditional conflict graph analysis in that it provides a new means to determine dependencies and to respond to them using transaction restructuring. By augmenting the definition of a transaction (or group of transactions) with the dependencies among steps or groups of steps of said transaction and its consistency points, whether by human or computer means, the identification of which steps must be performed in which order can be determined using means well-known to those familiar with the art, including manual means. This information enables a computer system capable of parallel or concurrent processing to perform those steps or groups of steps which satisfy certain criteria in parallel or in an order different from the order in which they are submitted, possibly at the discretion of an optimizer component. In particular, steps or groups of steps which can optionally be performed in parallel or in a different order are those that (1) have no mutual dependencies and (2) are not dependent on any other steps that have not yet been performed. Said dependencies information and said consistency point information may be supplied by any of a number of means.

For example, each dependency between a pair of steps might be supplied as a simple instruction “(1,2), (1,3)”, meaning that step 2 depends on step 1 and step 3 depends on step 1. Alternatively, the entire set of partially ordered dependencies might be supplied as a single data structure consisting of, for example, a linked list of trees with each tree specifying dependencies (a ‘dependency tree’), the linked list simply being one possible means of collecting the dependency trees. Similarly, steps can be grouped together such that they have no dependencies with any steps not in the group and such that, if they begin execution on resources that are in a consistent state, then those resources are left in a consistent state when that group of steps completes, such a grouping being known as a ‘consistent group’. A consistent group bounded by durable consistency points satisfies the formal definition of a transaction, albeit an implicit transaction. For example, if steps 1, 2, and 3 form such a group of steps bounded by consistency points and with the dependencies in the previous example, both dependency and consistency point information might be supplied via the instruction “<(1,2), (1,3)>”. Numerous other means for supplying such information will be apparent to one familiar with the art, some means being optimally non-redundant, some optimal for human specification, some optimal for space, some optimal for processing time, and some optimal for yet other purposes. In said augmentation, each dependency can be specified in such a manner as to uniquely identify both transactions and the steps of those transactions. For example, each transaction might be given a unique transaction identifier and each step an identifier unique within that transaction. Then, a dependency specification such as “(A.1, A.2), (A.1, B.3)” can be given at any time after the referenced transaction steps are specified.
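A dependency instruction of the pairwise form shown above might be interpreted along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the regular expression accepts only simple identifiers such as `1` or `A.1`, and every prerequisite is assumed to appear in the list of steps. Steps placed in the same wave have no mutual dependencies and no unperformed prerequisites, so they are candidates for parallel or reordered execution.

```python
import re
from collections import defaultdict


def parse_dependencies(spec):
    """Parse '(A.1, A.2), (A.1, B.3)' into (prerequisite, dependent) pairs."""
    return re.findall(r"\(\s*([\w.]+)\s*,\s*([\w.]+)\s*\)", spec)


def parallel_waves(steps, pairs):
    """Group steps into successive waves of mutually independent steps."""
    prereqs = defaultdict(set)
    for before, after in pairs:
        prereqs[after].add(before)

    done, waves = set(), []
    remaining = list(steps)
    while remaining:
        # A step is ready when all of its prerequisites have been performed.
        ready = [s for s in remaining if prereqs[s] <= done]
        if not ready:
            raise ValueError("cyclic or unsatisfiable dependencies")
        waves.append(ready)
        done.update(ready)
        remaining = [s for s in remaining if s not in done]
    return waves


pairs = parse_dependencies("(1,2), (1,3)")
waves = parallel_waves(["1", "2", "3"], pairs)
```

For the instruction “(1,2), (1,3)” the sketch yields step 1 alone in the first wave and steps 2 and 3 together in the second.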

In the event that a transaction definition is not augmented with dependencies among the steps and among a group of transactions, dependencies can be determined automatically or semi-automatically via, for example, the following methods:

if two transaction steps or groups of steps do not touch the same resources, they are independent (although they may be transitively dependent);

if two transaction steps or groups of steps have the same ultimate result irrespective of the order of application (that is, if they are commutative), they are independent;

if two transaction steps or groups of steps have no applicable consistency conditions in common, they are independent; or,

if two transaction steps or groups of steps cannot both violate at least one consistency condition, thereby producing the same error, they are independent.
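The first two of the rules listed above lend themselves to a simple illustration. In the following sketch, the function names and the modelling of a step as a function from a state dictionary to a new state dictionary are assumptions made for the example. The commutativity check only tests one sample state, so it is a screening aid and not a proof of independence.

```python
def touch_disjoint_resources(resources_a, resources_b):
    """Rule 1: steps that touch no common resource are independent
    (transitive dependence through other steps is not examined here)."""
    return set(resources_a).isdisjoint(resources_b)


def commute_on(state, op_a, op_b):
    """Rule 2, checked on one sample state: applying the two operations
    in either order gives the same result."""
    a_then_b = op_b(op_a(dict(state)))
    b_then_a = op_a(op_b(dict(state)))
    return a_then_b == b_then_a


# Hypothetical operations on an account balance.
def deposit_10(state):
    return {**state, "balance": state["balance"] + 10}


def deposit_5(state):
    return {**state, "balance": state["balance"] + 5}


def double_balance(state):
    return {**state, "balance": state["balance"] * 2}
```

Two deposits commute on any balance, so they would be treated as independent, whereas a deposit and a doubling give different results depending on order and would remain dependent.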

Transaction steps or consistent groups which execute within a single application instance and that are independent may be restructured. Consistent groups or transactions that are independent (that is, all of the steps in one consistent group or transaction are independent of all the steps in the other consistent group or transaction, respectively) may be restructured even if they run in separate application instances and with transaction isolation guaranteed by the system. As is well known, this fact enables the execution of such transactions without the overhead of locking (used to enforce isolation in pessimistic concurrency control) or the overhead of conflict detection mechanisms (used in optimistic concurrency control), thereby further optimizing the performance of transaction processing, so long as only mutually independent transactions are executed concurrently.

By extension, transactions that do not meet the mutual independence criteria may be simultaneously scheduled using some other method (the ‘local method’) to maintain concurrency and isolation (such as two-phase locking) provided that every collection of mutually independent transactions (or consistent groups) is isolated from each other and from all other transactions. Insofar as the ‘local method’ is concerned, each collection of mutually independent transactions (or consistent groups) is made to appear as a single transaction. For example, if two-phase locking is the local method, locks are maintained for each collection of mutually independent transactions (or consistent groups) as if they were a single transaction, and transactions (or consistent groups) within the collection read through all locks held by the collection but respect locks held by transactions outside the collection.

The method of dependency-based concurrency optimization may be extended with the concept of “conflict classes.” Transactions are divided into classes, and possibly belong to multiple classes. Each pair of classes is specified as being either dependent (potentially in conflict) or independent (impossible to ever be in conflict). If a transaction is not yet classified, it is evaluated to determine with which classes it potentially conflicts and with which it is independent. To belong to a class, the transaction must be potentially in conflict with every transaction in the said class. If the transaction matches the dependency and independency properties of the said class with respect to all other classes, it belongs to the said class; otherwise, it belongs to a different class. If no existing class meets these criteria, the transaction belongs to a new class. Transaction definitions are uniquely identified, and are recorded as belonging to a particular class based on that transaction identifier. Transactions are invoked by transaction identifier. Whenever a transaction request is received with such an identifier (or some means which permits association with such an identifier), the scheduler determines the classes to which the transaction belongs and from this information obtains the list of classes with which it is potentially in conflict (the dependent classes). It then checks to see if any running transaction belongs to one of the dependent classes. If such a transaction is running, the desired transaction is either deferred until that transaction completes, or another method of guaranteeing transaction isolation is used. If no such transaction is running, the desired transaction is executed.
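The scheduling decision based on conflict classes might look roughly like the sketch below. The class name `ConflictClassScheduler`, its methods, and the example class names are hypothetical. Only the deferral branch is modelled (the alternative of applying another isolation method is omitted), and transactions sharing a class are treated as potentially conflicting, following the membership rule stated above.

```python
class ConflictClassScheduler:
    """Admit a transaction only if no running transaction belongs to a
    class that is dependent on one of its own classes (illustrative)."""

    def __init__(self, classes_of, dependent_pairs):
        self.classes_of = classes_of                        # txn id -> set of classes
        self.dependent = {frozenset(p) for p in dependent_pairs}
        self.running = []                                   # ids of running transactions

    def conflicts(self, txn_a, txn_b):
        # Members of the same class potentially conflict by definition;
        # distinct classes conflict only if the pair is marked dependent.
        return any(
            ca == cb or frozenset((ca, cb)) in self.dependent
            for ca in self.classes_of[txn_a]
            for cb in self.classes_of[txn_b]
        )

    def request(self, txn_id):
        """Return True and start the transaction, or False to defer it."""
        if any(self.conflicts(txn_id, other) for other in self.running):
            return False
        self.running.append(txn_id)
        return True

    def complete(self, txn_id):
        self.running.remove(txn_id)


scheduler = ConflictClassScheduler(
    classes_of={"T1": {"debit"}, "T2": {"debit"}, "T3": {"report"}, "T4": {"reprice"}},
    dependent_pairs=[("debit", "reprice")],
)
```

A deferred request would simply be retried after `complete` is called for the conflicting transaction.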

Refinements of the technique are possible. In one embodiment, for example, the classifying of transactions is done at the transaction step level and it is then possible to schedule concurrent transaction steps from multiple transactions (as will be apparent to anyone familiar with the art). In this embodiment, each subsequent step of a transaction must be shown to be independent of all preceding and current steps of running transactions before it is permitted to run. In another embodiment, concurrent transactions proceed step by step until a possible conflict based on classes is detected, at which point one transaction is either deferred until the other transaction completes or else is rolled back to a consistency point (possibly the beginning of the transaction) and either resubmitted or a corrective transaction is submitted.

6 Combined Implementation (the Preferred Embodiment)

The preferred embodiment of the present invention is implemented in software (the ‘Adaptive Transaction Manager’) on a distributed network of computers with a distributed database management system implementing a business process involving multiple business entities. The business process consists of a large number of transactions, tasks, activities, and other units of work, many of them complex and some of them of an ad-hoc nature such that the entirety of their constituent steps or operations is not knowable in advance. The Adaptive Transaction Manager automatically identifies dependencies, consistency points, consistent groups, and redundant consistent groups. If a deadlock or other failure is encountered, the Adaptive Transaction Manager automatically recovers by rollback to a consistency point that eliminates the source of the error and then attempts to redo the work (it aborts only after retrying a pre-determined number of times or after a pre-determined amount of work). Redundant consistent groups are eliminated using transaction relaying, since two concurrent transactions having the same consistent group may share the work done by that group. A combination of transaction relaying, restructuring, and corrective transactions is used to eliminate most distributed transactions. When an error occurs, the error is classified according to whether it represents a transient, semantic, or hardware failure. If it is transient or hardware, transaction rollback to the most recent consistency point is invoked and the intervening work is resubmitted. This sequence is repeated for up to a fixed number of times and possibly with an intervening time delay (both determined by the type of error) until the transaction either succeeds or the number is surpassed. If the number of repetitions is surpassed, transaction rollback to an earlier consistency point is invoked and the work resubmitted. This process continues until the system recovers.

If the error is semantic, the Adaptive Transaction Manager determines which prior consistency point will provide a starting point of an alternate path within the business process that best leads to an acceptable state, preferably with the least effort and best chance of successful completion. It then invokes one or more corrective transactions that together are functionally and semantically equivalent to that alternate path. The Adaptive Transaction Manager optimizes for efficiency through the use of lookahead resource management and dependency-based concurrency optimization, restructuring transactions and consistent groups where possible to minimize overhead (for example, due to locking).
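The recovery sequence for transient and hardware failures can be outlined as in the following sketch. It is deliberately simplified and rests on several assumptions: the function `recover` and its parameters are hypothetical, the optional time delay between attempts is omitted, a single retry limit is used for all error types, and a semantic error is merely reported back to the caller, who would then select an alternate path and corrective transactions.

```python
def recover(run_from, consistency_points, classify, max_retries=3):
    """Retry work from successively earlier consistency points.

    run_from(point)     re-executes the work following `point`, raising on failure
    consistency_points  ordered from earliest to most recent
    classify(exc)       returns "transient", "hardware", or "semantic"
    """
    for point in reversed(consistency_points):       # most recent point first
        for _attempt in range(max_retries):
            try:
                return ("ok", run_from(point))       # work resubmitted successfully
            except Exception as exc:
                if classify(exc) == "semantic":
                    # Retrying cannot help; a corrective transaction is needed.
                    return ("corrective", point)
        # Retry limit surpassed: fall back to the next earlier consistency point.
    return ("aborted", None)
```

In this outline the result `"aborted"` arises only when every available consistency point has been tried the maximum number of times.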

The Adaptive Transaction Manager can roll back the system to a consistency point if there is an error that cannot be compensated for, if the cost of the compensation exceeds the value gained by the correction, or for other similar reasons. In a further enhancement of the preferred embodiment, the system record of the data and resources used in each transaction is used to hand off responsibility and control over the data and resources from one transaction to the next as each completes, that is, as each reaches a consistency point. Only those data and resources which are fully and correctly ‘transitioned’ are handed off, allowing auditable and non-interfering distribution or partial branching to occur without the hazard of contaminating data or processes, and without incurring the overhead of both multiple copies of data and tracking the current ‘correct’ subset. In this sense, a transaction that has reached a partial state which is correct for all other transactions for a subset of the data and resources it uses alone, can commit and release those data and resources rather than continue to tie them up needlessly.

Under the preferred embodiment of the present invention, the Adaptive Transaction Manager actively uses dependencies to detect which transaction needs to own what data and what resources at each particular step along a complex transaction, and minimizes duplication and locking of the same. Moreover, variable exploration of alternatives becomes feasible by implementing, in a further extension of the preferred embodiment, alternative methodologies for controlling such data and resources. For example, voting rules may be used (three processors to two), hierarchical rules (home office database overrules local branch), or heuristically derived rules peculiar to a particular business or operation.

In the preferred embodiment of the invention, whenever a compensating or corrective transaction is needed, a full audit trail of the original acceptable state, mistaken state, compensating or corrective transaction, and final acceptable state is maintained. In a further extension of the preferred embodiment the log of individual error audits is analyzed to identify recurring problems and suggest where additional preventative efforts should be taken, including additional corrective transactions.

In the preferred embodiment, for each predetermined transaction the anticipated consistency category of the final state is registered with the Adaptive Transaction Manager, classes of errors are associated with corresponding classes of recovery methods (including compensating or corrective transactions), and the Adaptive Transaction Manager determines which compensating or corrective transactions to execute so that recovery to an acceptable state can take place automatically and consistently. Additionally, the Adaptive Transaction Manager maintains a log of ‘acceptable’ states as transactions are processed without uncompensated errors. The extent to which the Adaptive Transaction Manager allows transitions to become permanent depends now more on the level of accuracy which the business feels comfortable with than upon the static limitations of record-keeping.

Extensions to the preferred embodiment would make the system more applicable for particular business purposes including telecommunications rerouting; inventory management for retail or distributional operations that encounter spillage, wastage, or theft; electronic funds transfer message repair; financial transactions affected by governmental fiats; and billing systems reflecting or affected by collection processes, debtor failures, and bankruptcies.

In a further extension of the present invention this method is applied to a model for negotiations, allowing hypothetical or proposed solutions, and their consequences and costs, to be explored and evaluated.

In a further extension of the present invention this method is applied to asset exchanges where the parties do not have an initial agreement as to the value of the particular elements, or even agreement as to the particular elements that are the subject of the proposed exchange, beforehand, to allow intermediate positions to be evaluated and the costs and benefits of concessions and tradeoffs to be explicitly assessed.

However, the scope of this invention includes any combination of the elements from the different embodiments disclosed in this specification, and is not limited to the specifics of the preferred embodiment or any of the alternative embodiments mentioned above. Individual user configurations and embodiments of this invention may contain all, or less than all, of the elements disclosed in the specification according to the needs and desires of that user. The claims stated herein should be read as including those elements which are not necessary to the invention yet are in the prior art and may be necessary to the overall function of that particular claim, and should be read as including, to the maximum extent permissible by law, known functional equivalents to the elements disclosed in the specification, even though those functional equivalents are not exhaustively detailed herein.

182. A method for efficient transaction processing and error management implemented as an Adaptive Transaction Manager (‘ATM’), the method being extensible to multiple business entities (related or independent) and extensible to complex transactions, said method comprising: a coordinated set of sub-methods, extensible to and instantiable upon a distributed network of computers, each particular sub-method and any set thereof also being usable by either a unitary database management system or a distributed database management system, said sub-methods comprising steps for: implementing transaction consistency points; implementing transaction relaying; implementing corrective transactions; implementing lookahead-based resource management; and, implementing dependency-based concurrency optimization.
183. A general-purpose computer incorporating specific hardware and software for manipulating at least one database when processing at least one transaction, wherein said specific hardware and software comprise: means for implementing transaction consistency points; means for implementing transaction relaying; means for implementing corrective transactions; means for implementing lookahead-based resource management; and, means for implementing dependency-based concurrency optimization.
184. A general-purpose computer that includes software, dynamic and stable memory, and logical processing hardware, programmed for manipulating at least one database when processing at least one transaction and manipulating steps in at least one transaction, comprising: means for manipulating the software, logical processing hardware, and dynamic and stable memory, to designate a set of current data values for any part of the data in the database and any particular step in a transaction, as a transaction consistency point; means for manipulating the software, logical processing hardware, and dynamic and stable memory, to select at least one set of current data values for any part of the data in the database, and to manipulate any set of particular steps in at least two transactions, to effectuate transaction relaying; means for manipulating the software, logical processing hardware, and dynamic and stable memory, upon detection of an error condition, to selectively effectuate implementation of at least one corrective transaction; means for manipulating the software, logical processing hardware, and dynamic and stable memory, to automatically implement optimization of the use of said logical processing hardware and dynamic and stable memory through altering the steps in a definition of said transaction using lookahead-based resource management; and, means for manipulating the software, logical processing hardware, and dynamic and stable memory, to automatically manipulate the steps of said transaction and software, and automatically implement optimization of said logical processing hardware and dynamic and stable memory, for the processing of said transaction, through implementation of dependency-based concurrency optimization.
185. A method as in claim 182 wherein the ATM is applied to at least one member of a set of business problems comprising telecommunications, retail, inventory, funds transfer, message repair, financial transactions, government fiats, negotiation, asset exchanges, distributed business transactions, electronic commerce, business process automation, business-to-business exchanges, business integration, insurance, and billing.
186. A method as in claim 183 further comprising using transaction relaying to implement non-flat transactions.
187. A computerized method for both efficient transaction processing implemented as a defining feature of an Adaptive Transaction Manager (‘ATM’) and for determining a first transaction, said method comprising: (a) identifying a first set of consistency conditions on a first set of data elements, comprising at least a first consistency condition; (b) identifying a second set of consistency conditions on a second set of data elements, comprising at least a second consistency condition, without requiring the second set of consistency conditions to be distinct from the first set of consistency conditions; (c) associating the first set of consistency conditions with a first set of operations comprising at least one operation on at least one element from the combined first set of data elements and second set of data elements, the first set of operations having an initial state and a final state, said final state being: represented by the second set of data elements; required to satisfy the second set of consistency conditions; reached upon successful termination of the first set of operations; consistent with the second set of consistency conditions; computed from both the initial state and any parameters; and, resulting from unexceptional execution; (d) specifying the initial state of the first set of operations as being the first transaction's initial state; (e) performing at least one operation of the first set of operations; (f) specifying the final state of the first set of operations as being the first transaction's final state; and, (g) committing the first transaction automatically after determining that the first transaction's final state satisfies the second set of consistency conditions.
188. A method as in claim 187 for implementing a first implicit transaction as the first transaction wherein an explicit transaction directive to begin the first transaction does not precede any operation in the first set of operations as any part of the first transaction's initial state, an explicit transaction directive to end the first transaction does not follow the first set of operations' final operation as any part of the first transaction's final state, and every operation necessary to initiate and to end the first transaction is performed automatically.
189. A method as in claim 187 further comprising guaranteeing that the first transaction satisfies at least one member of a set of transaction properties comprising atomicity, consistency, isolation, and durability.
190. A method as in claim 188 wherein the step of committing the first implicit transaction guarantees the property of atomicity by performing the step if and only if each and every operation of the first set of operations is both successful and representable as a connected set of state transitions resulting from the first set of operations, said first set of operations being fully determined at the first transaction's final state.
191. A method as in claim 187 further comprising guaranteeing at least partially the property of consistency by: defining a class of consistency conditions prior to reaching the first transaction's initial state; and, determining that the first set of consistency conditions belongs to the class of consistency conditions.
192. A method as in claim 188 further comprising guaranteeing at least partially the property of consistency by: defining a class of consistency conditions prior to the first implicit transaction's final state being reached; and, determining that the second set of consistency conditions belongs to the class of consistency conditions.
193. A method as in claim 187 wherein the step of committing the first transaction guarantees the property of consistency by performing the step if and only if: the first transaction's initial state is determined to satisfy some set of consistency conditions belonging to a first class of consistency conditions defined prior to the first transaction reaching the first transaction's initial state; and, the first transaction's final state is determined to satisfy some set of consistency conditions belonging to a second class of consistency conditions defined prior to the first transaction reaching the first transaction's final state; wherein the first class of consistency conditions and the second class of consistency conditions may be one and the same class of consistency conditions.
 194. A method as in claim 187 further comprisingguaranteeing the property of isolation and incorporating the steps of:identifying a first sharable resource as being a portion of a firstintermediate state of the first transaction; identifying at least asecond transaction that has not terminated; determining that the portionof the first intermediate state of the first transaction is notinconsistent, non-contradictory, and non-conflicting with the secondtransaction; controlling sharing of the first sharable resource amongthe first transaction and the second transaction, including access bythe second transaction to the first sharable resource, based on the stepof determining.
 195. A method as in claim 194 wherein the step ofdetermining further comprises: identifying a common history of the firstsharable resource that is consistent with both the first transaction'sdefinition and history and with the second transaction's definition andhistory, said common history being functionally equivalent to apartially ordered set of states and state transitions of the firstsharable resource, each said state transition corresponding to anoperation capable of generating that state transition from one of saidstates.
 196. A method as in claim 194 wherein the step of controllingfurther comprises: precluding sharing of the first sharable resourceamong the first transaction and the second transaction when both thefirst transaction and the second transaction could not have accessed acommon initial state of the first sharable resource given the knownstates of the first sharable resource, when those states existed andwhen the first transaction and the second transaction began.
 197. Amethod as in claim 194 further comprising: rewriting at least onerewritable operation of any of the first transaction, the secondtransaction, and a second implicit transaction so as to be recorded ashaving been executed in the context of a different transaction;executing the at least one rewritable operation in the context of thedifferent transaction; and, using the result of said step of rewritingas component in a recoverable record of all operations and stateslogically necessary to maintain the common history of the first sharableresource.
 198. A method as in claim 197 wherein the differenttransaction is a new implicit transaction and the original,pre-rewriting result of said rewritable operation is then not committedin the context of any of the first transaction, the second transaction,and the second implicit transaction.
 199. A method as in claim 187wherein the property of durability is guaranteed by ensuring that thefirst transaction's final state is recoverable insofar as the firsttransaction's final state has any effect on transaction history at thetime of recovery.
 200. A method as in claim 199 wherein the firsttransaction's final state is recovered by recomputing the firsttransaction's final state.
 201. A method as in claim 187 furthercomprising: selecting a first alternative final state from among a setof alternative final states; basing the selection on at least one memberof a set of acceptability critera comprising any of measures of risk,measures of opportunity, measures of cost, and measures of benefit; and,ensuring that the final operation within the first transaction willyield the selected alternative final state.
 202. A method as in claim 187 further comprising: stating a first goal; deriving a first goal-oriented transaction's definition by selecting from a second set of operations a sequence of transitions from the first goal-oriented transaction's initial state to the first goal-oriented transaction's final state; and, determining that said first goal-oriented transaction's final state satisfies the first goal; and, executing the first goal-oriented transaction.
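The steps of claim 202 (state a goal, derive a sequence of transitions from a set of operations, determine that the final state satisfies the goal, execute) can be sketched as a search problem. The example below uses a breadth-first search, which is one possible derivation strategy chosen here for illustration and not one named by the claim. The functions `derive_transaction` and `execute`, the integer state, and the two operations are hypothetical.

```python
from collections import deque


def derive_transaction(initial_state, operations, goal, max_depth=10):
    """Breadth-first search for a shortest sequence of operation names whose
    final state satisfies `goal`. Returns None if none is found within max_depth.

    Assumption: states are hashable and operations are pure functions.
    """
    queue = deque([(initial_state, [])])
    seen = {initial_state}
    while queue:
        state, plan = queue.popleft()
        if goal(state):
            return plan
        if len(plan) >= max_depth:
            continue
        for name, op in operations.items():
            next_state = op(state)
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, plan + [name]))
    return None


def execute(initial_state, operations, plan):
    """Apply the derived sequence of operations to the initial state."""
    state = initial_state
    for name in plan:
        state = operations[name](state)
    return state


# Hypothetical operation set and goal.
operations = {"add_3": lambda s: s + 3, "double": lambda s: s * 2}
goal = lambda s: s == 14

plan = derive_transaction(1, operations, goal)
final_state = execute(1, operations, plan)
```

Here the derived plan is checked against the goal during the search itself, so the "determining" step and the "deriving" step are combined; a real transaction manager might verify the goal separately before execution.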
 203. A method as in claim 187 further comprising maintaining in durable storage at least one member of a set of audit log enhancements, said set of audit log enhancements comprising identification of an acceptable state, identification of a mistaken state, identification of a compensating transaction, identification of a corrective transaction, and identification of a final acceptable state.
 204. A method as in claim 188 further comprising: maintaining in durable storage at least one member of a set of audit log enhancements comprising identification of an acceptable state, identification of a mistaken state, identification of a compensating transaction, identification of a corrective transaction, identification of a final acceptable state, and, identification of the first implicit transaction.
 205. A method as in claim 203 wherein the at least one member of a set of audit log enhancements is maintained in an audit trail.
 206. A method as in claim 204 wherein the at least one member of a set of audit log enhancements is maintained in a transaction log.
 207. A general-purpose computer incorporating specific hardware and software for manipulating at least one database when processing at least one transaction, wherein said specific hardware and software comprise:
 Parser means for any of the set of interpreting and compiling transaction definitions;
 Repository means for storing, retrieving, and modifying elements of transaction metadata, including at least one of transaction definitions, consistency conditions, sets of consistency conditions, classes of consistency conditions, dependencies, publication/subscription definitions, audit log enhancements, and transaction resources;
 Repository Manager means for coordinating all stored information, including dependencies, transaction definitions, associations, sets of consistency conditions, classes of consistency conditions, audit log enhancements, consistency categories, and subscriptions;
 Consistency Manager means for detecting and verifying transaction consistency points, verifying consistency of transaction and resource histories, and defining implicit transactions;
 Dependency Manager means for interpreting dependency directives, detecting dependencies, determining transaction and resource histories, deriving sequences of operations to attain specified states as goals, identifying consistent groups based on dependencies and asserting the corresponding consistency points;
 Resource Manager means for implementing transaction relaying and lookahead-based resource management, accessing and updating resources, allocation management, scheduling, resource isolation, maintaining cache, maintaining other resource constraints, detecting resource requirements, implementing resource management directives, and providing resource management directives to the Restructuring Processor;
 Correction Processor means for implementing corrective transactions, correlating abnormal conditions and consistency points, and by any set of using direct association and using any of consistency condition categories and classes of consistency conditions, performing any of discovering, optimally selecting, and creating a corrective transaction and submitting the corrective transaction to the Execution Manager;
 Restructuring Processor means for rewriting transactions;
 Isolation Manager means for guaranteeing isolation of resources and transactions;
 Publication/Subscription Manager means for processing publication and subscription definitions, detecting publication events, and notifying appropriate subscribers of publication events;
 Execution Manager means for processing transactions, allocating and deallocating transaction contexts, passing directives and instructions to the appropriate ATM components, and orchestrating transaction scheduling, commit, rollback, and rollforward; and,
 Resource Scheduler means for implementing dependency-based concurrency optimization.